How to Eliminate 27 Million Professionals From the Workforce
This submission illuminates the intricate web of data collection, algorithmic processing, and non-consensual profiling that private data brokers and automated hiring software vendors conduct in employment decisions. It aims to clarify how such systems harm due process, fairness, and the fundamental rights of job seekers, especially those unknowingly blacklisted, filtered out, or misrepresented through shadow profiling.
 I. INTRODUCTION: DATA HAS BECOME DESTINY
In modern employment practices, a significant and often opaque digital infrastructure now governs access to economic opportunity. At the heart of this infrastructure are systems powered by shadow data—massive volumes of personal, behavioral, and inferred data mined from the internet, commercial brokers, and passive surveillance.
 Shadow data refers to profiles constructed without the individual’s direct input or consent, aggregated from:
Web scraping
Social media APIs
Academic and résumé databases
Consumer spending patterns
Public records and employment databases
Surveillance technologies (e.g., keystroke tracking, browser fingerprinting)
When these fragmented datasets are synthesized using AI, they create risk scores, cultural fit models, and predictive personas that influence whether someone is deemed employable—even before a human ever sees their résumé.
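To make that mechanic concrete, the following is a minimal, purely illustrative Python sketch of how fragmented records from the sources above might be merged into a single shadow profile and collapsed into a "risk score." All field names, weights, and values are hypothetical assumptions made for illustration; they do not reproduce any vendor's actual model.

```python
# Illustrative sketch only: hypothetical field names, weights, and data.
# No real vendor model, API, or dataset is represented here.

from dataclasses import dataclass, field


@dataclass
class ShadowProfile:
    """A synthetic profile aggregated from sources the candidate never provided."""
    name: str
    fragments: dict = field(default_factory=dict)   # source -> scraped/inferred attributes
    inferred: dict = field(default_factory=dict)    # attributes the model guesses


def merge_fragments(profile: ShadowProfile, source: str, attributes: dict) -> None:
    """Fold one scraped or brokered record into the unified profile."""
    profile.fragments[source] = attributes


def score_risk(profile: ShadowProfile) -> float:
    """Collapse heterogeneous signals into a single 'risk score' (hypothetical weights)."""
    weights = {"employment_gap_months": 0.04, "job_changes_5yr": 0.10, "negative_web_signal": 0.30}
    score = 0.0
    for attrs in profile.fragments.values():
        for key, value in attrs.items():
            score += weights.get(key, 0.0) * float(value)
    profile.inferred["risk_score"] = round(min(score, 1.0), 2)
    return profile.inferred["risk_score"]


if __name__ == "__main__":
    p = ShadowProfile(name="Candidate A")
    merge_fragments(p, "job_board_scrape", {"employment_gap_months": 7, "job_changes_5yr": 3})
    merge_fragments(p, "data_broker_feed", {"negative_web_signal": 1})
    print(score_risk(p))  # 0.88 - computed without the candidate's knowledge or input
```

The point is structural rather than numerical: the candidate supplies none of these inputs, and sees neither the intermediate profile nor the resulting score.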
 II. THE PLAYERS: WORKDAY & LIGHTCAST (FORMERLY EMSI BURNING GLASS)
 A. Lightcast.io’s Role in Data Brokerage
Lightcast, a leading labor market analytics firm, advertises access to over 1 billion job postings and mines thousands of datasets covering workers' job histories, skills, education, and inferred career trajectories. Its tools integrate with HR technology such as Workday, SAP SuccessFactors, and LinkedIn Recruiter, sharing labor market signals and even individual behavioral metadata.
According to Lightcast’s own website:
“We merge billions of historical job postings, online profiles, census, and education data into a unified labor market signal to inform talent acquisition.”
Much of this data originates from scraping sources like LinkedIn, GitHub, online résumés, and job boards—regardless of user consent. As described by the company, its real-time APIs push these insights directly into platforms like Workday Skills Cloud and HiredScore, creating dynamic profiles and predictive hiring filters.
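As a rough illustration of what such a real-time push of labor-market signals into an HR platform could look like, consider the sketch below. The endpoint URL, payload schema, and token are invented for illustration only; they are not Lightcast's or Workday's actual APIs.

```python
# Hypothetical integration sketch. The URL, payload schema, and auth token are
# invented for illustration and do not describe any real Lightcast or Workday API.

import requests

ENDPOINT = "https://hr-platform.example.com/api/v1/candidate-signals"  # placeholder URL


def push_labor_signal(candidate_id: str, inferred_skills: list[str], risk_flags: list[str]) -> int:
    """Push broker-derived, unconsented signals into a hiring platform's profile store."""
    payload = {
        "candidate_id": candidate_id,
        "inferred_skills": inferred_skills,      # skills the candidate never claimed
        "risk_flags": risk_flags,                # e.g. "job_hopper", "flight_risk"
        "source": "third_party_broker_feed",
    }
    response = requests.post(
        ENDPOINT,
        json=payload,
        headers={"Authorization": "Bearer <broker-api-token>"},
        timeout=10,
    )
    return response.status_code


if __name__ == "__main__":
    status = push_labor_signal("cand-0001", ["Excel", "CRM"], ["flight_risk"])
    print("push status:", status)
```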
B. Workday’s AI Stack and Integration with Brokered Data
Workday’s AI-driven hiring tools—including Skills Cloud, HiredScore, and Persona Synthesis—incorporate Lightcast and similar third-party datasets. These platforms advertise their ability to "analyze over 625 billion data points", which include job seeker behavioral metadata, inferred capabilities, and “adjacent” skills not explicitly listed on an applicant’s résumé.
 "The system learns from millions of successful employees’ careers and maps those trajectories onto new applicants to suggest their future performance.” – Workday AI Product Page
The outcome is a model-based exclusion system:
Candidates are algorithmically removed from consideration due to perceived skill gaps, inferred instability, or statistical misalignment with corporate culture, even when they’ve never interacted with the hiring employer directly.
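A minimal sketch of how such a pre-screening gate can remove applicants before any human review appears below, assuming a hypothetical "fit score" and cutoff; the scores, flags, and threshold are invented for illustration and do not reflect any vendor's actual logic.

```python
# Illustrative pre-screening gate. The fit scores, threshold, and rejection
# reasons are hypothetical; no vendor's actual scoring logic is shown.

CUTOFF = 0.65  # hypothetical minimum "fit" score to reach a human recruiter

candidates = [
    {"name": "Candidate A", "fit_score": 0.72, "flags": []},
    {"name": "Candidate B", "fit_score": 0.61, "flags": ["inferred_instability"]},
    {"name": "Candidate C", "fit_score": 0.40, "flags": ["skill_gap", "culture_misalignment"]},
]


def prescreen(pool: list[dict], cutoff: float) -> tuple[list[dict], list[dict]]:
    """Split the pool into those a recruiter will ever see and those silently dropped."""
    advanced = [c for c in pool if c["fit_score"] >= cutoff and not c["flags"]]
    excluded = [c for c in pool if c not in advanced]
    return advanced, excluded


advanced, excluded = prescreen(candidates, CUTOFF)
print("seen by a human:", [c["name"] for c in advanced])   # Candidate A only
print("never reviewed:", [c["name"] for c in excluded])    # B and C, with no notice given
```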
III. RESEARCH: SHADOW DATA’S CONSEQUENCES FOR JOB SEEKERS
The Scale of Harm
According to a 2022 Harvard Business Review study, more than 27 million qualified job seekers in the U.S. alone are filtered out by algorithmic hiring systems that rely on automated “fit” scoring (Bessen et al., 2022). Many are rejected due to inaccurate or missing data in their shadow profiles, not based on actual applications or interviews.
Disproportionate Impact on Vulnerable Populations
A 2023 paper by the Center for Democracy & Technology found that Black, disabled, formerly incarcerated, and neurodivergent candidates are disproportionately harmed by automated hiring tools that synthesize data through biased training sets and incomplete profiles.
Absence of Transparency
Research from the AI Now Institute underscores that job seekers cannot meaningfully access or dispute these profiles. The data, often incorrect or outdated, becomes a “digital rumor” masquerading as a résumé, barring equal access to work.
Legal and Ethical Failures
Few mechanisms exist to opt out of such profiling. Even under GDPR or CCPA, the right to explanation, correction, or deletion is rarely enforced or understood by users—especially when employers claim not to “control” the data but only license it from intermediaries.
IV. SOURCES OF SHADOW DATA: THE ENTIRE ECOSYSTEM
Shadow profiling relies on an expansive supply chain of data origins:
Source Type | Examples
Social Media & Résumé Sites | LinkedIn, GitHub, StackOverflow, Facebook, X
Education Records | College registries, MOOC enrollments, certificate APIs
Browser/User Tracking | Cookie tracking, heatmaps, behavioral marketing datasets
Financial Behavior | Credit indicators, purchase patterns, rent/payment histories
Data Brokers | Lightcast, People Data Labs, Acxiom, Oracle BlueKai
Employer Feedback Loops | Internal ATS rejections, résumé scoring, turnover tracking
These sources merge into centralized profiles, even across employers, leaving candidates unable to reset or contest their digital identity. The lack of consent is systemic, not incidental.
 V. DANGERS: UNVERIFIED DATA & BLACKLISTING
False Negatives: Qualified candidates labeled as “low potential” or “non-strategic.”
Reputation Traps: Once labeled as a “job hopper” or “flight risk,” future employers are algorithmically discouraged from engaging with the applicant.
Lack of Human Review: Automated sorting means hiring decisions are made long before a recruiter sees the name, increasing the risk of bias and exclusion.
 VI. CONCLUSION: CALL FOR TRANSPARENCY AND DATA MINIMIZATION
The entanglement between AI, shadow data, and employment access presents a clear and present danger to the fundamental right to work. The Amicus Curiae urges the Court to recognize:
The lack of meaningful consent in the current system;
The discriminatory effect of unverified profiling;
The need for auditable, explainable systems in employment tech;
The urgent necessity of discovery access for applicants challenging unfair digital evaluations.
 Only through judicial scrutiny and regulatory transparency can we prevent a future where workers are nothing more than datasets in a black box.
 V-A. THE FULL SCOPE OF SHADOW DATA USED IN EMPLOYMENT SCREENING
Contrary to common assumptions, shadow profiles used in employment contexts draw from a broad spectrum of highly sensitive personal data—far beyond traditional résumés or social media accounts.
 These datasets include:
Data Category | Examples and Risks
Demographic Data | Race, ethnicity, gender, disability status, sexual orientation, zip code–derived proxy variables
Health-Related Data | Mental health indicators, inferred disabilities, behavioral patterns from wearable devices or web searches
Education & Credentials | MOOC records, professional certifications, degree verification, academic citations
Employment History | Prior job titles, gaps in employment, inferred performance (even from former ATS systems)
Psychographics | Personality profiling, inferred attitudes, emotional tone (from emails, social posts, behavioral data)
Web Behavior & Metadata | Time spent on job sites, online application abandonment, IP/geolocation metadata
Financial Data | Credit risk proxies, purchasing habits, transactional histories, inferred socioeconomic class
Reputation Signals | Ratings on gig economy platforms, community engagement, blacklist status from prior employer systems
Much of this data is aggregated without consent and fed into hiring filters that evaluate a candidate’s supposed "fit," "stability," or "risk" based on experimental probabilistic models—not human judgment.
This data is routinely shared between brokers like Lightcast.io, People Data Labs, Acxiom, and platforms like Workday, SAP SuccessFactors, and Oracle HCM Cloud, where it is used to construct AI-based hiring decisions without notification or transparency to the job seeker.
V-B. HOW LARGE LANGUAGE MODELS ARE EMBEDDED INTO ENTERPRISE HR SYSTEMS
Large language models (LLMs), including those developed using transformer-based architectures like GPT, BERT, and T5, are now deeply embedded within recruitment, HR, and hiring ecosystems. They are not standalone tools; they are subsumed into platforms via APIs and SDKs, then licensed and distributed at scale.
1. Embedded LLMs in Platforms like Workday
Workday integrates LLMs to enhance résumé parsing, job description generation, candidate Q&A, and sentiment analysis. Vendors claim these models are:
Trained on millions of historical résumés, job postings, and interview transcriptions
Fine-tuned with employer-specific talent data (including rejected applicants, promotion histories, etc.)
Capable of synthesizing candidate profiles using inferred data and embedding soft indicators like tone, personality, or cultural fit
LLMs are also used to (a simplified matching sketch follows this list):
Match résumés to job descriptions using semantic similarity scores
Identify “adjacent” skills not explicitly listed
Predict potential employee trajectories and attrition risks
“Explain” why a candidate might be a poor fit—based on linguistic patterns and metadata
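To illustrate the first of these uses, the sketch below scores a résumé against a job description with cosine similarity over term counts. Production systems use learned LLM embeddings rather than raw term counts; the sample texts, tokenizer, and any cutoff applied to the score are simplifying assumptions.

```python
# Simplified resume-to-job matching via cosine similarity over term counts.
# Real systems use learned LLM embeddings; this toy stand-in shows the mechanic.

import math
import re
from collections import Counter


def vectorize(text: str) -> Counter:
    """Lowercase bag-of-words term counts (a crude proxy for an embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


job = "Seeking data analyst with SQL, Excel, and dashboard reporting experience"
resume = "Built SQL reports and Excel dashboards as a junior data analyst"

score = cosine_similarity(vectorize(job), vectorize(resume))
print(f"match score: {score:.2f}")  # a candidate below some cutoff may never be surfaced
```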
2. Distribution Across Employers
Workday markets these capabilities through its Skills Cloud and HiredScore integrations, which are built atop LLM infrastructure and available to:
Over 10,000 enterprise clients globally
Including Fortune 500 companies, government contractors, healthcare systems, and educational institutions
Clients may not be aware they are leveraging a centralized profiling infrastructure, as the AI features are abstracted into user-friendly dashboards for recruiters. However, what they are using are multi-layered LLMs trained on billions of data points, many of which come from public surveillance, scraped platforms, or third-party data brokers.
Notably, a Workday presentation in 2023 claimed it had access to and analyzed over 625 billion data points, used to “build contextual understanding” of job seekers beyond their explicit profiles.
V-C. LEGAL AND ETHICAL FAILURES IN TRANSPARENCY AND DATA PROVENANCE
Job seekers have no practical way to learn what is in these profiles, where the data originated, or how it was transformed before being used against them. Data provenance is obscured by layered licensing: brokers scrape and sell, platforms ingest and infer, and employers claim only to license the results rather than control them. Even under GDPR and the CCPA, the rights to explanation, correction, and deletion are rarely enforced or understood, and where inferred attributes function as consumer reports, FCRA notice and dispute obligations may be triggered without ever being honored.
V-D. IMMEDIATE RISKS TO ECONOMIC PARTICIPATION
The convergence of shadow data and LLM-powered AI in hiring is not a future risk—it’s an ongoing civil rights crisis. Tens of millions of Americans may be pre-emptively filtered from economic opportunity due to misrepresented, outdated, or simply incorrect algorithmic interpretations of who they are.
False negatives can prevent marginalized populations from re-entering the workforce.
Candidates with medical conditions may be excluded through inferred disability risk scoring.
Blacklisted reputations—perhaps from a gig platform or social post—may follow them indefinitely.
No legal system ensures redress at scale.
Adjacent Skills
Adjacent skills refer to related or transferable skills that are not explicitly listed on a resume, application, or profile but are logically connected or inferred based on a person’s past experiences, job titles, industries, or training.
Definition of Adjacent Skills:
Adjacent skills are capabilities inferred through AI or logical association, based on an individual’s existing or past skills, experiences, or job roles—even though those skills were not directly listed or claimed by the person.
Examples include:
Inferring Excel or attention to detail from "Data Entry Clerk"
Predicting CRM software familiarity for a "Customer Service Rep"
Assigning project management skills based on "Team Lead" or "Developer" titles
How Adjacent Skills Are Used by AI Systems
AI hiring platforms, such as Workday Skills Cloud, HiredScore, and Persona Synthesis, routinely (a simplified inference sketch follows this list):
Use large language models (LLMs) and natural language processing to predict likely adjacent skills
Generate "shadow profiles" from brokered and scraped data
Evaluate job seekers based on both listed and inferred attributes
Include predictive traits like "volatility," "gap risk," or "culture fit" using adjacent or behavioral signals
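The sketch below shows title-based adjacent-skill inference in its simplest form, assuming a hand-written lookup table built from the examples above; real systems derive these associations statistically from large corpora, and the mappings here are illustrative assumptions, not any vendor's taxonomy.

```python
# Illustrative adjacent-skill inference from job titles. The lookup table is a
# hand-written assumption; production systems learn these associations from data.

ADJACENT_SKILLS = {
    "data entry clerk": ["Excel", "attention to detail"],
    "customer service rep": ["CRM software"],
    "team lead": ["project management"],
    "developer": ["project management"],
}


def infer_adjacent_skills(job_titles: list[str]) -> set[str]:
    """Return skills the candidate never claimed, inferred purely from past titles."""
    inferred = set()
    for title in job_titles:
        inferred.update(ADJACENT_SKILLS.get(title.lower(), []))
    return inferred


listed_skills = {"typing", "scheduling"}            # what the candidate actually wrote
titles = ["Data Entry Clerk", "Team Lead"]
profile_skills = listed_skills | infer_adjacent_skills(titles)
print(profile_skills)  # now includes skills the candidate never listed or verified
```

The profile an employer sees is the union of claimed and inferred attributes, and the candidate has no visibility into which is which.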
Legal and Ethical Risks
Inaccuracy and Misrepresentation: Adjacent skills may not reflect true competencies.
Lack of Consent: Job seekers rarely consent to the inference of skills they did not report.
Disparate Impact: Inferred traits may replicate or worsen racial, gender, or age bias.
FCRA Violation: Inferred skills used in hiring decisions may trigger FCRA obligations if treated as consumer reports.
Title VII Discrimination: Disparate impact may occur if inferred data correlates with protected characteristics.