Reference

Docs.

What peoplesets generates, what knobs you have, and the exact shape of the parquet you'll read. Live-fetched from the API — if you see it here, it's in the engine.

Get started

Quickstart

Sign up for a free key, then make a call. Every dataset is a zip of four parquet files plus two markdown files (summary.md and data_dictionary.md).

1. Get a key

Drop your email at peoplesets.com#get-key. We mint the key inline and show it once, so save it.

2. Make a call

Every endpoint takes Authorization: Bearer $PEOPLESETS_API_KEY. The most common one is POST /generate-company.

POST /generate-company
curl -X POST https://peoplesets-api-production.up.railway.app/generate-company \
  -H "Authorization: Bearer $PEOPLESETS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "industry_pack": "tech_startup",
    "size": 500,
    "seed": 42
  }'
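
If you'd rather call from Python, the same request can be built with the standard library. This is a minimal sketch, not an official client: the helper name build_generate_request is hypothetical, and the response body is left unparsed because its shape isn't described on this page.

```python
import json
import urllib.request

BASE_URL = "https://peoplesets-api-production.up.railway.app"


def build_generate_request(api_key, industry_pack="tech_startup", size=500, seed=42):
    """Build (but do not send) the POST /generate-company request."""
    payload = {"industry_pack": industry_pack, "size": size, "seed": seed}
    return urllib.request.Request(
        f"{BASE_URL}/generate-company",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# To send it: urllib.request.urlopen(build_generate_request(api_key)).read()
```

Building the request separately from sending it keeps the payload easy to inspect before anything goes over the wire.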

3. Poll until done

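Jobs run asynchronously. Poll GET /jobs/$JOB_ID, where $JOB_ID identifies the job started by the POST above. Status goes pending → running → done. Default sims finish in 1–3 seconds; 10k-person sims take 20–40 seconds.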

curl -s -H "Authorization: Bearer $PEOPLESETS_API_KEY" \
  https://peoplesets-api-production.up.railway.app/jobs/$JOB_ID
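
The polling loop can be sketched in Python as below. It assumes you supply a fetch_status callable that performs the GET and returns the status string; how that string is pulled out of the response is not documented here, so that part is left to you. The names poll_until_done, interval, and timeout are hypothetical.

```python
import time


def poll_until_done(fetch_status, interval=1.0, timeout=120.0, sleep=time.sleep):
    """Call fetch_status() until it returns "done" or the timeout elapses.

    fetch_status is any zero-argument callable returning the job's current
    status string. Statuses other than pending/running/done are treated as
    errors, since only those three are documented.
    """
    waited = 0.0
    while True:
        status = fetch_status()
        if status == "done":
            return status
        if status not in ("pending", "running"):
            raise RuntimeError(f"unexpected job status: {status!r}")
        if waited >= timeout:
            raise TimeoutError(f"job still {status!r} after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

A one-second interval fits the 1–3 second default sims; for 10k-person sims a longer interval saves a few requests.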

4. Download the zip

curl -L -H "Authorization: Bearer $PEOPLESETS_API_KEY" \
  https://peoplesets-api-production.up.railway.app/jobs/$JOB_ID/artifacts.zip \
  -o peoplesets-$JOB_ID.zip
unzip -l peoplesets-$JOB_ID.zip
# employees.parquet  events.parquet  comp_events.parquet  recruiting.parquet
# summary.md         data_dictionary.md
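
Before loading anything, you may want to confirm the archive holds the six files listed above. A small sketch using Python's zipfile module; the helper name missing_artifacts is hypothetical, and comparing base names only is an assumption in case the zip nests its members in a folder.

```python
import zipfile

EXPECTED_FILES = {
    "employees.parquet",
    "events.parquet",
    "comp_events.parquet",
    "recruiting.parquet",
    "summary.md",
    "data_dictionary.md",
}


def missing_artifacts(zip_source):
    """Return the expected file names that are absent from the archive.

    zip_source may be a path or a file-like object. Only base names are
    compared, so members nested in a folder still count as present.
    """
    with zipfile.ZipFile(zip_source) as archive:
        present = {name.rsplit("/", 1)[-1] for name in archive.namelist()}
    return sorted(EXPECTED_FILES - present)
```

An empty list means every documented file is there.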

Catalog

Industry packs

Curated configs that shape the dataset into a recognizable narrative. Pass the name as industry_pack on /generate-company.

Healthcare system

industry_pack: healthcare_system

3,000-employee regional hospital network. Heavy clinical operations, significant administrative tail, regulated comp bands, and low attrition characteristic of credentialed healthcare workforces (<8% annualized). US-only. Slower promotion cadence than tech; equity not material.

Retail chain

industry_pack: retail_chain

Regional retail chain with ~5,000 employees split between HQ corporate and store-frontline staff. Frontline (L1/L2 in CustomerSuccess, Sales, Operations) dominates headcount and exhibits the high turnover real retail companies see (30%+ annualized at the frontline). US + Canada footprint with hourly comp dominant at the junior levels.

Tech startup

industry_pack: tech_startup

Series-B / early Series-C software company. ~200 employees, heavily engineering-weighted (>40% of headcount), US-headquartered with a small international presence, aggressive promotion cadence, equity-rich comp at L4+. Standard 3-year simulation produces a clear hyper-growth signature followed by typical scaleup attrition.

Scenarios

Narrative events layered onto a base sim. Use special_events: ["name"] on /generate-company, or call POST /apply-scenario with a single scenario for the shortcut path.
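
As a rough sketch, a /generate-company body with scenarios attached could be assembled like this. The helper name generate_payload is hypothetical; that more than one scenario name may be listed is inferred from the array syntax, not confirmed here. Scenario parameters are left out because this page doesn't show how they are passed.

```python
import json


def generate_payload(industry_pack, size, seed, special_events=()):
    """Build the JSON body for POST /generate-company.

    special_events is a list of scenario names such as "rif"; the key is
    omitted when no scenarios are requested.
    """
    payload = {"industry_pack": industry_pack, "size": size, "seed": seed}
    if special_events:
        payload["special_events"] = list(special_events)
    return payload


# Example body: a tech startup with a hiring ramp and a reduction in force.
body = json.dumps(generate_payload("tech_startup", 500, 42, ["hyper_growth", "rif"]))
```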

Reduction in force

scenario: rif

A mid-sim involuntary headcount reduction concentrated in the lowest-performing 2–4% of the company. Produces clear `layoff` events in the events table and a step-down in active headcount.

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| pct | number | 0.03 | Fraction of headcount cut (0.01–0.10). |
| quarter | string | Q3 | Quarter to land the cut (Q1/Q2/Q3/Q4). |
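
To catch out-of-range values before you send them, the documented bounds can be checked client-side. A minimal sketch: validate_rif_params and expected_cut are hypothetical helpers, and the rounding in expected_cut is an assumption about how a fraction maps to whole people, not the engine's actual rule.

```python
def validate_rif_params(pct=0.03, quarter="Q3"):
    """Check rif parameters against the documented ranges and return them."""
    if not 0.01 <= pct <= 0.10:
        raise ValueError(f"pct must be between 0.01 and 0.10, got {pct}")
    if quarter not in ("Q1", "Q2", "Q3", "Q4"):
        raise ValueError(f"quarter must be one of Q1-Q4, got {quarter!r}")
    return {"pct": pct, "quarter": quarter}


def expected_cut(headcount, pct=0.03):
    """Rough number of employees removed for a given headcount (assumed rounding)."""
    return round(headcount * pct)
```

With the defaults, a 500-person company loses about 15 people in Q3.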

Hyper-growth ramp

scenario: hyper_growth

Doubles the base hiring rate and accelerates org expansion. Use this for a 'we just raised a Series C' narrative — the events frame shows visibly more hires in the second half of the sim.

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| multiplier | number | 2 | Hiring-rate multiplier (default 2.0). |

Merger or acquisition

scenario: m_and_a

An acquired team appears mid-sim as a burst of hires from a single department, followed by a wave of integration-driven attrition. Use for post-acquisition demos.

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| acquired_size | integer | 50 | Headcount of the acquired team. |

Distressed company

scenario: distressed

Elevates voluntary attrition across the board (recession-grade exit pressure) and suppresses promotion velocity. Use for 'retention crisis' demos.

Leadership shake-up

scenario: leadership_shake_up

Mid-sim involuntary terminations concentrated at L6+ (VP and above), followed by a backfill wave from external hiring. Use for 'new CEO swept the bench' narratives.

Schema

Data dictionary

Four parquet files per run. Schema is additive — new columns appear, existing columns never change shape, so downstream code keeps working.
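
One way to lean on the additive guarantee is to select columns by name and ignore everything else. The sketch below assumes rows arrive as plain dicts (for example from pyarrow.parquet.read_table(path).to_pylist()); pick_columns is a hypothetical helper.

```python
def pick_columns(row, required):
    """Return only the required columns from a row dict.

    Extra columns are ignored, which keeps this code working when the
    schema gains new columns. A missing required column raises KeyError.
    """
    missing = [name for name in required if name not in row]
    if missing:
        raise KeyError(f"missing columns: {missing}")
    return {name: row[name] for name in required}
```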

employees.parquet

Employee roster with demographics, compensation, performance, and org hierarchy.

| Column | Description |
| --- | --- |
| employee_id | Unique identifier for each employee (e.g., EMP0000001) |
| full_name | Employee full name (synthetic) |
| grade | Job grade/level (L1-L8) |
| department | Department name |
| is_manager | Whether the employee is a people manager |
| gender | Gender identity |
| ethnicity | Ethnicity/race category |
| age_range | Age bracket (e.g., 25-34) |
| location | Office location |
| hire_date | Date the employee was hired |
| base_salary | Current base salary in USD |
| current_performance | Most recent performance rating (1.0-5.0) |
| current_engagement | Most recent engagement score (1.0-5.0) |
| job_title | Job title based on department and grade |
| status | Employment status: active or terminated |
| termination_date | Date of termination (null if active) |
| termination_type | Voluntary or involuntary (null if active) |
| termination_reason | Reason for termination (null if active) |
| manager_id | Employee ID of direct manager (foreign key -> employee_id) |
| promotion_count | Number of promotions received |
| last_promotion_date | Date of most recent promotion |
| last_merit_date | Date of most recent merit increase |
| emp_type | Employment type (permanent) |
| fte | Full-time equivalent (1.0 = full time) |
| country | ISO3 country of employment (USA, GBR, CAN, IND, DEU, AUS, BRA, IRL) |
| city | City of employment within the employee's country |
| currency | Local-payroll currency for the employee's country (USD, GBP, EUR, INR, ...) |
| org_level_1 | Top-level org hierarchy label |
| business_unit | Business-unit rollup of the employee's department (Technology, Go-To-Market, G&A) |
| cost_center | Cost-center code derived from family + country + department (e.g., CC-ENG-USA-42) |
| job_family | Canonical family key from the job architecture (Engineering, Product, Sales, ...) |
| job_level | Integer level 1..8 within the job family |
| track | Career track: IC (individual contributor) or Manager (people-leader) |
| job_code | Stable internal code combining family, level, and track (e.g., ENG-03-I) |
| name_locale | Country code whose name pool generated the employee's full_name |
| local_salary | Base salary expressed in the employee's local currency (base_salary * geo multiplier) |
| equity_grant_value | USD value of outstanding equity grant; populated for L4+ employees, otherwise null |
| tenure_days | Days employed (calculated at simulation end) |
| tenure_years | Years employed (calculated at simulation end) |
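
Because manager_id points back at employee_id, the roster can be walked as a tree. Here is a sketch that returns an employee's reporting chain, assuming rows are plain dicts and that the top of the org has a null manager_id (an assumption; check your data). reporting_chain is a hypothetical helper.

```python
def reporting_chain(employees, employee_id):
    """Return manager IDs from the employee's direct manager upward.

    employees is an iterable of row dicts with employee_id and manager_id.
    The walk stops at a null manager, an ID missing from the roster, or a
    cycle, so malformed data cannot loop forever.
    """
    manager_of = {row["employee_id"]: row.get("manager_id") for row in employees}
    chain, seen = [], {employee_id}
    current = manager_of.get(employee_id)
    while current is not None and current in manager_of and current not in seen:
        chain.append(current)
        seen.add(current)
        current = manager_of[current]
    return chain
```

The length of the chain gives the employee's depth in the org hierarchy.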

events.parquet

Chronological log of all workforce events (hires, terminations, promotions, reviews, etc.).

| Column | Description |
| --- | --- |
| event_id | Unique event identifier (e.g., EVT000000001) |
| event_type | Event kind: hire, termination, promotion, perf_review, merit, reorg, layoff, hiring_freeze, leave_start, leave_end |
| employee_id | Employee this event pertains to (foreign key -> employees.employee_id; 'ORG' for company-wide events) |
| event_date | Date the event occurred |
| grade | Employee grade at time of event ('ALL' for company-wide events) |
| department | Employee department at time of event ('ALL' for company-wide events) |
| source | Hiring source (for hire events): growth, ramp, backfill, founding |
| salary | Salary at hire (for hire events) |
| type | Termination type (for termination events): voluntary or involuntary |
| reason | Termination reason (for termination/layoff events) |
| old_grade | Grade before promotion (for promotion events) |
| new_grade | Grade after promotion (for promotion events) |
| old_salary | Salary before change |
| new_salary | Salary after change |
| performance | Performance rating (for review events) |
| engagement | Engagement score (for review events) |
| prior_performance | Performance rating before event |
| from_grade | Grade before promotion |
| to_grade | Grade after promotion |
| leave_type | Category of leave (for leave_start events): parental_or_medical |
| reorg_id | Identifier of the reorg this event belongs to (e.g., REORG-001) |
| affected_count | Number of employees affected by a company-wide event (reorg, layoff) |
| layoff_pct | Fraction of headcount targeted by a layoff event header |
| duration_weeks | Length of a hiring-freeze window in weeks |
| end_date | End date of a hiring-freeze window (ISO format) |
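
Since company-wide rows share the table with per-employee rows, it helps to separate them before counting. A minimal sketch, assuming rows are plain dicts; summarize_events is a hypothetical helper that relies only on the 'ORG' marker described above.

```python
from collections import Counter


def summarize_events(events):
    """Count per-employee events by type and list company-wide events.

    Company-wide rows are identified by employee_id == "ORG". Returns a
    Counter keyed by event_type and a list of (event_date, event_type)
    tuples for the company-wide rows.
    """
    per_employee = Counter()
    company_wide = []
    for row in events:
        if row["employee_id"] == "ORG":
            company_wide.append((row["event_date"], row["event_type"]))
        else:
            per_employee[row["event_type"]] += 1
    return per_employee, company_wide
```

Skipping this split would count each company-wide header row as if it were one more employee event.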

comp_events.parquet

Compensation change events with before/after salary data.

| Column | Description |
| --- | --- |
| comp_event_id | Unique compensation event identifier (e.g., CMP000000001) |
| employee_id | Employee this event pertains to (foreign key -> employees.employee_id) |
| event_date | Date the compensation change occurred |
| event_type | Type: merit_increase, promotion_increase, market_adjustment |
| old_salary | Salary before the change |
| new_salary | Salary after the change |
| change_amount | Absolute dollar change (new - old) |
| change_pct | Percentage change as decimal (e.g., 0.04 = 4%) |
| grade | Employee grade at time of comp event |
| department | Employee department at time of comp event |
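
The derived columns can be sanity-checked against the salary pair. This sketch assumes change_pct is measured relative to old_salary, which the table implies but does not state; the helper name and tolerances are hypothetical and exist only to absorb rounding in stored values.

```python
import math


def comp_event_is_consistent(row, amount_tol=0.01, pct_tol=1e-4):
    """Check change_amount and change_pct against old_salary and new_salary.

    Assumes change_pct = (new_salary - old_salary) / old_salary. Rows with
    a zero old_salary are reported as inconsistent rather than dividing.
    """
    if not row["old_salary"]:
        return False
    expected_amount = row["new_salary"] - row["old_salary"]
    expected_pct = expected_amount / row["old_salary"]
    return (
        math.isclose(row["change_amount"], expected_amount, abs_tol=amount_tol)
        and math.isclose(row["change_pct"], expected_pct, abs_tol=pct_tol)
    )
```

For example, a raise from 100,000 to 104,000 should carry change_amount 4,000 and change_pct 0.04.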

recruiting.parquet

Recruiting funnel data for each hire, from requisition open to close.

| Column | Description |
| --- | --- |
| requisition_id | Unique requisition identifier (e.g., REQ0000001) |
| employee_id | Hired employee ID (foreign key -> employees.employee_id) |
| grade | Grade level of the position |
| department | Department of the position |
| open_date | Date the requisition was opened |
| close_date | Date the requisition was filled |
| time_to_fill_days | Days from open to close |
| source | Recruiting source (job_board, referral, etc.) |
| applications | Total number of applications received |
| reviewed | Number of applications that passed initial review |
| phone_screens | Number of phone screens conducted |
| assessments | Number of assessments completed |
| interviews | Number of interviews conducted |
| offers_extended | Number of offers extended |
| outcome | Outcome of the requisition (hired) |
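
The count columns read naturally as a funnel. Below is a sketch of stage-to-stage conversion rates, assuming rows are plain dicts and that the stages run in the column order shown above (an assumption about the funnel's sequence). FUNNEL_STAGES and funnel_conversion are hypothetical names.

```python
FUNNEL_STAGES = [
    "applications",
    "reviewed",
    "phone_screens",
    "assessments",
    "interviews",
    "offers_extended",
]


def funnel_conversion(requisitions):
    """Return stage-to-stage conversion rates summed over all requisitions.

    requisitions is a list of row dicts. Each rate is a stage's total
    divided by the previous stage's total; if the previous stage totals
    zero, the rate is None.
    """
    totals = {stage: sum(row[stage] for row in requisitions) for stage in FUNNEL_STAGES}
    rates = {}
    for earlier, later in zip(FUNNEL_STAGES, FUNNEL_STAGES[1:]):
        rates[f"{earlier}->{later}"] = (
            totals[later] / totals[earlier] if totals[earlier] else None
        )
    return rates
```

Summing before dividing weights each requisition by its volume, so a few tiny requisitions don't skew the rates.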

Going deeper

API reference

For the full OpenAPI spec — every endpoint, every field, every response code — see the live Swagger UI. It's the source of truth.