LinkedIn is one of the most valuable sources for job market data. If you're building a job aggregator, doing labor market research, or tracking hiring trends across industries, web scraping LinkedIn jobs with Python gives you direct access to that data at scale. LinkedIn actively blocks automated traffic, so you can't just fire requests at it and expect results. You need the right tools and the right approach to get consistent data without getting blocked.

In this article, we'll explore how to scrape LinkedIn jobs with Python, what tools to use, and how to avoid getting your requests cut off.

What You Need Before Starting

You need Python installed along with Playwright to handle JavaScript-rendered pages. LinkedIn loads job listings dynamically, so Playwright is a better choice than Requests and BeautifulSoup, which only see the initial HTML. Install it with pip (pip install playwright), then run playwright install chromium to download the browser binaries.

You also need a rotating residential proxy. LinkedIn is aggressive about blocking datacenter IPs, and a single static IP sending repeated requests will get blocked fast. Residential proxies cycle through real household IPs automatically, making your traffic look like it's coming from different real users. Proxyon offers residential proxies starting at $1.75/GB with no subscription required.

Finally, set up data storage. A CSV file works fine for basic job data. For repeated scraping at scale, SQLite is a cleaner option.
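
If you go the SQLite route, the following is a minimal sketch of what that storage layer could look like, using Python's built-in sqlite3 module. The function name save_jobs, the table name, and the choice of the job URL as the primary key are assumptions made for illustration; the column names simply mirror the fields collected later in this article.

```python
import sqlite3


def save_jobs(db_path, jobs):
    """Store job dicts in SQLite and return the total number of rows kept.

    Assumes each dict has the keys title, company, location, and url.
    Using the URL as the primary key is an illustrative choice so that
    repeated runs do not store the same listing twice.
    """
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                """CREATE TABLE IF NOT EXISTS jobs (
                       url TEXT PRIMARY KEY,
                       title TEXT,
                       company TEXT,
                       location TEXT
                   )"""
            )
            # INSERT OR IGNORE skips rows whose URL is already stored.
            conn.executemany(
                "INSERT OR IGNORE INTO jobs (url, title, company, location) "
                "VALUES (:url, :title, :company, :location)",
                jobs,
            )
        return conn.execute("SELECT COUNT(*) FROM jobs").fetchone()[0]
    finally:
        conn.close()
```

Calling save_jobs("jobs.db", jobs) after each run appends only the listings that were not already in the database, which is the main advantage over rewriting a CSV file every time.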

How to Scrape LinkedIn Jobs With Python

Once you have everything set up, import Playwright and pass your proxy credentials into the browser launch settings. Proxyon gives you a single endpoint URL with your username and password, so you plug those in directly.

Python

from playwright.sync_api import sync_playwright

SEARCH_URL = "https://www.linkedin.com/jobs/search/?keywords=python+developer&location=New+York"


def text_of(card, selector):
    # Return the stripped text of the first match, or None if it is missing.
    element = card.query_selector(selector)
    return element.inner_text().strip() if element else None


with sync_playwright() as p:
    # Route all browser traffic through the rotating residential proxy.
    browser = p.chromium.launch(
        proxy={
            "server": "http://residential.proxyon.io:8080",
            "username": "your_username",
            "password": "your_password",
        }
    )
    page = browser.new_page()
    page.goto(SEARCH_URL)
    # Wait until at least one job card has rendered before extracting.
    page.wait_for_selector(".job-card-container")

    jobs = []
    for card in page.query_selector_all(".job-card-container"):
        link = card.query_selector("a")
        jobs.append({
            "title": text_of(card, ".job-card-list__title"),
            "company": text_of(card, ".job-card-container__company-name"),
            "location": text_of(card, ".job-card-container__metadata-item"),
            "url": link.get_attribute("href") if link else None,
        })
    browser.close()

Navigate to the LinkedIn jobs search page with your target keyword and location. Use the wait_for_selector method to make sure job cards are fully loaded before extracting anything. From there, loop through each card and pull the job title, company name, location, and job URL using CSS selectors. Store each result as a dictionary and append it to a list.

After the loop, write the list to a CSV file using Python's built-in csv module. The whole script comes to only a few dozen lines. To paginate beyond the first page, increment the start parameter in the search URL and repeat the process for each page of results.
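
The CSV export and pagination steps can be sketched as follows. This is an illustration rather than a definitive implementation: the helper names write_jobs_csv and search_url are hypothetical, and the step of 25 results per page in the usage note is an assumption you should check against the pages you actually receive.

```python
import csv
from urllib.parse import urlencode

# Column order for the CSV; matches the keys of each scraped job dict.
FIELDS = ["title", "company", "location", "url"]


def write_jobs_csv(path, jobs):
    """Write a list of job dicts to a CSV file with a header row."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(jobs)


def search_url(keywords, location, start=0):
    """Build a jobs search URL; start is the offset of the first result."""
    query = urlencode({"keywords": keywords, "location": location, "start": start})
    return f"https://www.linkedin.com/jobs/search/?{query}"
```

For example, looping over range(0, 100, 25) and passing each value as start to search_url would produce four page URLs under the assumed page size; you would call page.goto on each one, collect the cards as before, and call write_jobs_csv("jobs.csv", jobs) once at the end.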

Final Thoughts

Playwright handles the dynamic rendering, rotating residential proxies keep your requests from getting blocked, and a CSV export gives you clean data ready to work with. The main things that trip people up are sending requests too fast and using the wrong proxy type. Get those two things right, and you'll have a reliable scraper that runs consistently. Proxyon offers residential proxies starting at $1.75/GB with no subscription required.