
Convert W-2 PDF to Excel on Mac (2026 tax-season guide)

Turn a stack of W-2 PDFs into one clean Excel sheet on Mac — on-device on macOS 14.4+, with every box reconciled and totals that match what you file.


It is the second week of February and the W-2s have landed. One from your main employer, one from the job you left in March, one for your spouse, and — because you run a small shop — a stack of them you have to hand to your accountant for everyone on payroll. Each one is a PDF: a downloaded copy from ADP, a scanned paper form your accountant mailed, an emailed export from Gusto. And every one of them carries the same twenty-odd boxes, laid out in the same federal grid, that you now need as spreadsheet rows so the numbers can be summed, compared, and filed without re-typing a single dollar figure.

The single-form case is almost not worth a tool: you can read one W-2 and type Box 1 and Box 2 into your tax software. The case that eats an afternoon is the stack — multiple employers, multiple employees, or a few years of your own W-2s you’re reconciling for a mortgage or an amended return. They look identical at a glance and they are not: Box 12 has up to four coded sub-lines per form, the state and local section repeats once per state you worked in, and a scanned form introduces OCR noise exactly where the digits matter most.

This guide is the Mac-native workflow: convert W-2 PDFs to Excel on Mac, in batch, on-device on macOS 14.4+, with one row per form, every box in its own column, and totals that reconcile against what you actually file.

Why Mac (not iPhone) for a stack of W-2s

A W-2 often arrives on your phone first — a payroll email, a PDF in Mail. For a single form you want one number from, the phone is fine. But assembling a set of W-2s into a sheet is desk work, and the Mac wins for three concrete reasons:

  • Drag the whole folder at once. A two-earner household with a mid-year job change is four W-2s; a ten-person shop is ten; three years of your own filings is a small pile. A folder of PDFs is a Mac drag-and-drop, not a share sheet repeated per form.
  • The big screen makes box-by-box verification fast. The one check that matters — does the extracted Box 1 match the digits printed on the form — is a glance with the workbook on one half of a 27-inch display and the PDF on the other.
  • macOS 14.4+ on Apple Silicon keeps your SSN on the device. A W-2 prints your full Social Security number, your employer’s EIN, your address, and your full-year wages. On an M1-or-newer Mac running macOS 14.4+, the entire batch extracts locally — nothing is uploaded.

If your Mac is Intel-era (T2 or earlier), the on-device model isn’t available and ignitai falls back to a hosted pipeline with documented zero retention. For Apple Silicon Macs on macOS 14.4+, every SSN on every form stays on the machine.

Why a W-2 is harder to extract than it looks

A W-2 is a federal form, so “they’re all the same” feels true. The grid is standardized; the content is not, and a generic table extractor flattens exactly the parts that carry meaning:

  1. Box 12 codes. This is the box that breaks naive extraction. It holds up to four lines, each a one- or two-letter code plus an amount: D for 401(k) elective deferrals, DD for the cost of employer health coverage, W for HSA contributions, C for group-term life, and more than twenty others. The code determines what the amount means. A coordinate-based extractor reads “12a, 12b, 12c, 12d” as four anonymous cells and throws away the codes that make them interpretable.
  2. Boxes 1, 3, and 5 are three different wage figures. Box 1 (federal taxable wages) is not Box 3 (Social Security wages) is not Box 5 (Medicare wages). Pre-tax 401(k) lowers Box 1 but not Boxes 3 and 5; the Social Security wage base caps Box 3 but not Box 5. They look like the same number when they’re close and diverge in ways that matter for reconciliation. Each needs its own column.
  3. The state and local section repeats. Boxes 15 through 20 (state, employer state ID, state wages, state tax, local wages, local tax, locality name) appear once per state you worked in. Someone who moved mid-year has two full rows of state data on one W-2. A flat extractor smashes them together.
  4. Scanned forms add OCR noise where it hurts most. Many W-2s arrive as scans or photos of paper. A 3 misread as an 8, or a misplaced decimal, lands in a wage box — the single place an error propagates straight into your return.

A tool that grabs “the table” hands you a denormalized grid where Box 12 codes are gone and the three wage figures blur together. A tool that understands a W-2 as a W-2 puts each box in its own labeled column, keeps the Box 12 codes attached to their amounts, expands the state section into proper columns, and gives you one clean row per form.
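To make the difference concrete, here is a minimal sketch of what a W-2-aware record could look like before it is flattened into a spreadsheet row. The class and column names (`W2Record`, `StateLine`, `state1_wages`, and so on) are illustrative assumptions, not ignitai's internal format; the point is only that Box 12 stays a code-to-amount mapping and the state section stays a list.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Dict, List, Optional


@dataclass
class StateLine:
    """One pass through the state section; a form can carry more than one."""
    state: str                              # Box 15, two-letter code
    state_wages: Optional[Decimal] = None   # Box 16
    state_tax: Optional[Decimal] = None     # Box 17


@dataclass
class W2Record:
    """Hypothetical in-memory shape for one form (names are assumptions)."""
    source_file: str
    box1_wages: Optional[Decimal] = None
    box3_ss_wages: Optional[Decimal] = None
    box5_medicare_wages: Optional[Decimal] = None
    box12: Dict[str, Decimal] = field(default_factory=dict)   # code -> amount
    states: List[StateLine] = field(default_factory=list)

    def to_row(self) -> dict:
        """Flatten to one row: box12_<code> columns and numbered state columns."""
        row = {
            "source_file": self.source_file,
            "box1_wages": self.box1_wages,
            "box3_ss_wages": self.box3_ss_wages,
            "box5_medicare_wages": self.box5_medicare_wages,
        }
        for code, amount in sorted(self.box12.items()):
            row[f"box12_{code}"] = amount
        for i, line in enumerate(self.states, start=1):
            row[f"state{i}"] = line.state
            row[f"state{i}_wages"] = line.state_wages
            row[f"state{i}_tax"] = line.state_tax
        return row


# Made-up figures for a two-state form with two Box 12 entries.
example = W2Record(
    source_file="w2_example.pdf",
    box1_wages=Decimal("52000.00"),
    box3_ss_wages=Decimal("55000.00"),
    box5_medicare_wages=Decimal("55000.00"),
    box12={"D": Decimal("3000.00"), "DD": Decimal("8200.00")},
    states=[
        StateLine("NY", Decimal("30000.00"), Decimal("1500.00")),
        StateLine("NJ", Decimal("22000.00"), Decimal("900.00")),
    ],
)
row = example.to_row()
```

Keeping the three wage figures as separate fields and the codes as dictionary keys means nothing has to be inferred from cell position later.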

Method 1: ignitai on Mac (the on-device way)

ignitai treats W-2 extraction as a language task — “read this form and tell me Box 1, and the Box 12 code D amount” — not a grid-coordinate task. That’s why it survives the difference between a crisp ADP PDF and a slightly skewed scan without per-employer templates. The full Mac batch flow:

  1. Drag the folder of W-2s into ignitai. Every form in one drop. Mixed sources are fine — an ADP PDF, a Gusto export, and a scanned paper form go through the same batch. Up to 500 PDFs in one pass on an M-series Mac.

  2. Describe what to extract, once for the whole batch. Plain English. A prompt that works across W-2 layouts:

    “For each W-2 return one row: employee_name, employer_name, employer_ein, box1_wages, box2_fed_tax, box3_ss_wages, box4_ss_tax, box5_medicare_wages, box6_medicare_tax, box12_codes (as code:amount pairs), box13_retirement_plan (true/false), box16_state_wages, box17_state_tax, and the two-letter state in box 15. If a box is blank, leave it blank — do not guess or enter zero.”

    Save it as a preset. Next February it’s one click.

  3. Pick XLSX as the output format. One sheet, one row per form, currency formatting auto-applied to the dollar columns.

  4. Hit Extract. ignitai runs each PDF through the on-device model and streams rows into the workbook with live per-file progress. A folder of ten W-2s on an M2 Mac finishes in well under a minute.

  5. Review the consolidated sheet. Every row carries a source_file column with the original PDF filename so you can trace any number back to its form. Open in Excel for Mac or Numbers.

  6. Re-run failures, not the whole batch. A form that failed — a badly skewed scan, a password-protected PDF — gets listed separately. Fix it, re-run just that file, append.

Every SSN, every EIN, every wage figure lives on your Mac. Nothing is uploaded.
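If you later want each Box 12 code in its own column, the `box12_codes` cell can be split after export. The sketch below assumes the cell arrives as text like `D:3000.00, DD:8,200.00`; the exact format ignitai writes is not specified here, so treat the pattern and the `expand_box12` helper as hypothetical and adjust them to what your sheet actually contains.

```python
import re
from decimal import Decimal

# Assumed cell format: "<code>:<amount>" pairs separated by commas or semicolons.
PAIR = re.compile(r"\b([A-Z]{1,2})\s*:\s*\$?([\d,]+(?:\.\d{1,2})?)")


def expand_box12(cell):
    """Turn 'D:3000.00, DD:8,200.00' into {'box12_D': ..., 'box12_DD': ...}."""
    if not cell:
        return {}  # a blank Box 12 stays blank rather than becoming zero
    columns = {}
    for code, amount in PAIR.findall(cell):
        columns[f"box12_{code}"] = Decimal(amount.replace(",", ""))
    return columns


expanded = expand_box12("D:3000.00, DD:8,200.00")
```

Using `Decimal` rather than `float` keeps the amounts exact to the cent, which matters once the columns are summed.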

Method 2: a Python script (the DIY path)

If every W-2 comes from one provider with a stable PDF template and all are text-based (not scans), you can own the pipeline:

pip install pdfplumber pandas openpyxl

Then a script that opens each PDF, pulls the boxes by text position, and writes one consolidated sheet:

import re
from pathlib import Path

import pandas as pd
import pdfplumber


def to_amount(match):
    """Turn a captured '52,000.00' into a float so Excel can sum the column."""
    return float(match.group(1).replace(",", "")) if match else None


rows = []
for pdf_path in sorted(Path("./w2s").glob("*.pdf")):
    with pdfplumber.open(pdf_path) as pdf:
        # Image-only (scanned) pages yield no text, hence the `or ""`.
        text = "\n".join(p.extract_text() or "" for p in pdf.pages)
    # Provider-specific anchors: this set matches one ADP layout only.
    box1 = re.search(r"1\s+Wages.*?\s([\d,]+\.\d{2})", text)
    box2 = re.search(r"2\s+Federal.*?\s([\d,]+\.\d{2})", text)
    rows.append({
        "source_file": pdf_path.name,
        "box1_wages": to_amount(box1),
        "box2_fed_tax": to_amount(box2),
    })

# One row per PDF; boxes whose anchor did not match stay blank, not zero.
pd.DataFrame(rows).to_excel("w2s.xlsx", index=False)

This works for one provider, one stable template, all text-based PDFs. It breaks the moment any of the following happens:

  • A form is a scan. A photographed or scanned W-2 has no text layer, so pdfplumber returns an empty string. You’d layer in OCR, and accuracy drops exactly on the wage digits.
  • You need Box 12 codes. The regex count explodes — one pattern per code position per layout — and the codes are the part that’s actually hard.
  • Multiple employers, multiple layouts. ADP, Gusto, Paychex, and a county payroll office all position the boxes differently; your anchors silently stop matching and you get columns full of None.
  • The state section repeats. A two-state W-2 needs the regex to capture a variable number of state rows — the case where flat pattern-matching falls apart entirely.

For one-provider, text-based stability, the script is a one-time cost. Across scans, job changes, or a multi-employee shop, the maintenance burden eats the savings.
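Because unmatched anchors fail silently, a DIY pipeline benefits from a guard that lists which files came back incomplete before anything is written to Excel. This is a minimal sketch that assumes rows shaped like the ones the script builds (a `source_file` key plus one key per box); the `triage` helper is a hypothetical name.

```python
def triage(rows, required=("box1_wages", "box2_fed_tax")):
    """Split rows into (complete, needs_review) based on missing box values."""
    complete, needs_review = [], []
    for row in rows:
        missing = [key for key in required if row.get(key) is None]
        if missing:
            # Likely a scan with no text layer, or a layout the anchors miss.
            needs_review.append(
                {"source_file": row["source_file"], "missing": missing}
            )
        else:
            complete.append(row)
    return complete, needs_review


# Made-up rows: one clean, one where neither anchor matched.
sample_rows = [
    {"source_file": "adp_2025.pdf", "box1_wages": 52000.0, "box2_fed_tax": 6100.0},
    {"source_file": "scan_0001.pdf", "box1_wages": None, "box2_fed_tax": None},
]
complete, needs_review = triage(sample_rows)
```

Printing `needs_review` at the end of a run turns "columns full of None" into an explicit list of forms to check by hand.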

Method 3: Preview + Numbers (the no-install fallback)

For two or three clean, text-based W-2s you don’t want to install anything for: open each in Preview, select the box values, Cmd-C, paste into Numbers, fix the columns by hand, repeat per form. It works for a handful of identical, simple forms. It falls apart on scans, on Box 12’s coded lines, and on any form with a repeated state section. For more than a few forms, save the preset in Method 1 instead. The same trade-off plays out for paystubs in the paystub Mac guide.

Method 4: web converters (and why not for W-2s)

Generic “PDF to Excel” web tools exist, and a W-2 is precisely the document you should never put through one. You’d upload your full SSN, your employer’s EIN, your home address, and your full-year income to a server you don’t control. Free tiers gate at one to three files. And none are W-2-aware, so they collapse the wage boxes and discard the Box 12 codes. For a blank sample form, fine. For your actual W-2, no.

The reconciliation checks, on Mac

Once the sheet is written, four checks separate “I have a file” from “numbers I’d put on a return”:

  1. Box 4 against Box 3. Social Security tax (Box 4) should equal 6.2% of Social Security wages (Box 3). Box 3 is itself capped at the annual wage base, so the check holds on every row, high earners included. Add a column: =box3_ss_wages * 0.062. It should match Box 4 to within a few cents of payroll rounding (if Box 7 reports tips, add them to Box 3 first). A larger mismatch flags a misread in either box.
  2. Box 6 against Box 5. Medicare tax (Box 6) should equal 1.45% of Medicare wages (Box 5), with no wage cap. =box5_medicare_wages * 0.0145 should match Box 6; where Box 5 exceeds $200,000, Box 6 also includes the 0.9% Additional Medicare Tax withheld on the amount above $200,000. This is the cleanest single-row sanity check on a W-2.
  3. Box 1 vs Box 5 plausibility. Box 1 should usually be at or below Box 5, because pre-tax 401(k) deferrals lower Box 1 but not Medicare wages. A row where Box 1 exceeds Box 5 by a wide margin is almost always a column swap from extraction.
  4. Totals against what you file. Sum box1_wages across all of one person’s W-2s; that total is what flows to the wages line of the 1040. Sum box2_fed_tax; that’s the federal withholding you’ll claim. If you have contractor income too, this pairs with the 1099 consolidation flow on Mac.

Skip these and a misread 8 that should be a 3 surfaces as an IRS notice in the fall, when the wages the IRS has on file don’t match the return.
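The three per-row checks can also be run in code rather than as spreadsheet formulas. The sketch below is one way to do it, under stated assumptions: column names follow the prompt in Method 1, amounts are `Decimal` values, the $1.00 tolerance and the 5% "wide margin" threshold are arbitrary choices of ours, and tips in Box 7 are ignored.

```python
from decimal import Decimal

SS_RATE = Decimal("0.062")
MEDICARE_RATE = Decimal("0.0145")
ADDL_MEDICARE_RATE = Decimal("0.009")
ADDL_MEDICARE_FLOOR = Decimal("200000")
TOLERANCE = Decimal("1.00")     # assumed slack for payroll rounding
SWAP_MARGIN = Decimal("1.05")   # assumed reading of "by a wide margin"


def check_row(row):
    """Return warnings for one W-2 row; an empty list means every check passed."""
    warnings = []
    box1 = row.get("box1_wages")
    box3, box4 = row.get("box3_ss_wages"), row.get("box4_ss_tax")
    box5, box6 = row.get("box5_medicare_wages"), row.get("box6_medicare_tax")

    # Check 1: Box 4 should be 6.2% of Box 3 (Box 7 tips are not handled here).
    if box3 is not None and box4 is not None:
        if abs(box3 * SS_RATE - box4) > TOLERANCE:
            warnings.append("Box 4 is not 6.2% of Box 3")

    # Check 2: Box 6 should be 1.45% of Box 5, plus 0.9% above $200,000.
    if box5 is not None and box6 is not None:
        expected = box5 * MEDICARE_RATE
        if box5 > ADDL_MEDICARE_FLOOR:
            expected += (box5 - ADDL_MEDICARE_FLOOR) * ADDL_MEDICARE_RATE
        if abs(expected - box6) > TOLERANCE:
            warnings.append("Box 6 does not match Box 5")

    # Check 3: Box 1 well above Box 5 usually means swapped columns.
    if box1 is not None and box5 is not None and box1 > box5 * SWAP_MARGIN:
        warnings.append("Box 1 exceeds Box 5 by a wide margin")

    return warnings


# Made-up figures: a consistent row, then the same row with a 3 misread as 8.
clean = {
    "box1_wages": Decimal("52000.00"),
    "box3_ss_wages": Decimal("55000.00"),
    "box4_ss_tax": Decimal("3410.00"),
    "box5_medicare_wages": Decimal("55000.00"),
    "box6_medicare_tax": Decimal("797.50"),
}
misread = dict(clean, box4_ss_tax=Decimal("8410.00"))
```

Blank boxes are skipped rather than treated as zero, so a form with an empty Box 3 does not raise a false alarm.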

What you do with the Excel sheet next

The consolidated workbook feeds whichever job started this:

  • Filing your own return. One row per W-2 with Box 1 and Box 2 broken out means the wages and withholding lines on the 1040 are a SUM, not a re-typing exercise across four forms. You keep the PDFs as the source of truth alongside the sheet.
  • A multi-employee shop’s year-end. One row per employee lets the bookkeeper tie the W-2 totals to the payroll register — and the per-period detail behind each W-2 comes from the paystub Mac flow, so the year reconciles from stub to W-2 to filing.
  • A mortgage or amended-return packet. Lenders and the IRS both want wage history across years. A clean sheet with one row per W-2 per year, plus the source PDFs, is far easier to hand over than a folder of look-alike forms.
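For the filing case, the per-person totals are a short aggregation over the rows. This sketch assumes the `employee_name`, `box1_wages`, and `box2_fed_tax` columns from the Method 1 prompt and groups on the name exactly as extracted; if the same person is printed differently on two forms, normalize the names first. The `totals_by_employee` helper is hypothetical.

```python
from collections import defaultdict
from decimal import Decimal

SUMMED = ("box1_wages", "box2_fed_tax")


def totals_by_employee(rows):
    """Sum Box 1 and Box 2 per employee_name, skipping blank boxes."""
    totals = defaultdict(lambda: {key: Decimal("0") for key in SUMMED})
    for row in rows:
        bucket = totals[row["employee_name"]]
        for key in SUMMED:
            if row.get(key) is not None:
                bucket[key] += row[key]
    return dict(totals)


# Made-up rows: one person with two employers, one with a single form.
w2_rows = [
    {"employee_name": "A. Rivera", "box1_wages": Decimal("40000.00"),
     "box2_fed_tax": Decimal("4000.00")},
    {"employee_name": "A. Rivera", "box1_wages": Decimal("12000.00"),
     "box2_fed_tax": Decimal("1100.00")},
    {"employee_name": "B. Chen", "box1_wages": Decimal("61000.00"),
     "box2_fed_tax": None},
]
totals = totals_by_employee(w2_rows)
```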

When this workflow doesn’t fit

Honest edge cases:

  • Your payroll system exports a W-2 register. ADP, Gusto, Workday, and Paychex all offer a CSV/Excel W-2 export in the admin portal. If you have admin access, use it — extracting from PDFs when the system will hand you a register is a workflow built backwards. The PDF path is for employees and accountants who only have the forms.
  • You need one number from one W-2. Don’t build a batch. Open the form, read the box, done.
  • Corrected W-2s (W-2c). A W-2c only shows the boxes that changed, in a previously-reported-vs-corrected pair. Tighten the prompt to capture both the prior and corrected amounts, and reconcile against the original W-2 row rather than treating it as a standalone form.
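Reconciling a W-2c against its original row can be sketched as follows. It assumes you have extracted each changed box as a (previously reported, corrected) pair keyed by the same column names as the original sheet; `apply_w2c` is a hypothetical helper, and it reports any box whose "previously reported" figure disagrees with the original row, since that usually points to a misread or the wrong original form.

```python
from decimal import Decimal


def apply_w2c(original_row, corrections):
    """Return (updated_row, mismatched_columns) for one W-2c.

    corrections maps column name -> (previously_reported, corrected).
    """
    updated = dict(original_row)
    mismatches = []
    for column, (previous, corrected) in corrections.items():
        if original_row.get(column) != previous:
            mismatches.append(column)   # prior amount disagrees with the W-2 row
        updated[column] = corrected
    return updated, mismatches


# Made-up figures: the W-2c corrects Box 1 only.
original = {
    "source_file": "w2_2025.pdf",
    "box1_wages": Decimal("52000.00"),
    "box2_fed_tax": Decimal("6100.00"),
}
w2c = {"box1_wages": (Decimal("52000.00"), Decimal("52750.00"))}
updated, mismatches = apply_w2c(original, w2c)
```

Boxes the W-2c does not mention are carried over unchanged, which mirrors how the form itself works.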

Bottom line

For a stack of W-2 PDFs that needs to become one Excel sheet on Mac — for your own filing, a small shop’s year-end, or a multi-year mortgage packet — install ignitai, drag the folder in, write the prompt once, pick XLSX, hit Extract, and run the four reconciliation checks. For a single-provider, text-based, no-Box-12 year you want to own end to end, a pdfplumber-plus-pandas script is a valid alternative if you don’t mind maintaining the anchors. For everything else — scanned forms, multiple employers, Box 12 codes, or anything carrying an SSN you’d rather not upload — the on-device Mac batch is the shortest distance from a pile of forms to numbers that match your return.

The parallel payroll flow (per-period earnings instead of annual totals) is in the paystub Mac guide; the contractor-income equivalent is in the 1099 Mac guide; and the broader mixed-document batch pattern is in the Mac batch guide. Same app, same presets, same on-device guarantee.

Get ignitai on the App Store — free download, $19.99/mo unlocks unlimited batch extractions after the 3-day trial.