Last updated: 5 May 2026 · Version 1.1 · UK GDPR Art 14 notice.
In plain English
PCRA maintains a directory of UK primary care practices so that research sponsors can find sites suited to a given study. To keep that directory accurate, we periodically read information that practices have already published on their own public websites (practice name, address, phone, opening hours, named clinicians, specialty interests). We never read behind a login, never take data marked private, and never collect patient information.
If you run a practice and would rather we didn't do this, scroll to How to opt your practice out.
What we collect
- Practice name, registered address, regional geography.
- Public contact details (phone, reception email, website URL).
- Named partners / GPs and their public job titles.
- Published services, specialty interests, research involvement.
- The URL and HTTP status code we retrieved the page from, plus a timestamp — so we can show you exactly what we saw and when.
What we do not collect: patient data of any kind, data that requires logging in to access, data marked noindex / robots: noindex, or any content behind a robots.txt disallow rule that applies to our user-agent.
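As an illustration of the noindex exclusion above, here is a minimal Python sketch of how a fetched page could be checked before anything is stored. It is not PCRA's production crawler code: the helper name `is_noindex`, and the choice to check both the `X-Robots-Tag` response header and the `<meta name="robots">` tag, are assumptions made for this example.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collects the content of every <meta name="robots"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() == "robots":
            self.directives.append(attr_map.get("content", "").lower())


def is_noindex(html, headers=None):
    """Return True if the page asks not to be indexed (hypothetical helper)."""
    headers = headers or {}
    # Header form, e.g. "X-Robots-Tag: noindex"
    for name, value in headers.items():
        if name.lower() == "x-robots-tag" and "noindex" in value.lower():
            return True
    # Meta-tag form, e.g. <meta name="robots" content="noindex, nofollow">
    parser = _RobotsMetaParser()
    parser.feed(html)
    return any("noindex" in directive for directive in parser.directives)
```

Under this sketch, a page for which `is_noindex` returns True would be skipped entirely rather than fetched and filtered later.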
Legal basis (UK GDPR Art 6)
We rely on legitimate interests (Art 6(1)(f)) — specifically, the interest of running an accurate, UK-resident research network directory so that sponsor-funded studies can reach the practices best placed to run them. We have carried out a balancing test (available on request); in summary:
- Necessary: manual data entry at UK scale is not feasible, and sending a form to every practice would be more intrusive than reading a page the practice has already published.
- Proportionate: we collect only information the practice has chosen to publish, and we obey robots.txt.
- Reasonable expectation: anything on a public practice site is already indexed by search engines; our use is narrower and easier to object to.
For named individuals (e.g. a partner's name), we process only data the practice has published in a professional capacity on the practice's own site. We do not scrape personal accounts, social media, or consumer review sites.
How we scrape responsibly
- We identify ourselves with the user-agent PCRAllianceResearchBot/1.0 and respect robots.txt rules that target it or *.
- We rate-limit ourselves to one request every few seconds per domain and back off on 429 / 5xx responses.
- We never bypass authentication, CAPTCHA, or paywalls.
- Pages are re-fetched no more than once per month per practice in normal operation.
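The rate-limiting and back-off behaviour described in this list can be sketched roughly as follows. This is an illustrative outline only, not our actual implementation: the class and function names are hypothetical, the 3-second interval is an assumed value for "every few seconds", and the doubling schedule with a 300-second cap is one common back-off choice rather than a documented setting.

```python
import time


class DomainThrottle:
    """Enforces a minimum gap between requests to the same domain."""

    def __init__(self, min_interval=3.0, clock=time.monotonic):
        # 3.0 seconds is an assumed value for "every few seconds".
        self.min_interval = min_interval
        self.clock = clock
        self._last_request = {}

    def wait_time(self, domain):
        """Seconds to wait before the next request to this domain is allowed."""
        last = self._last_request.get(domain)
        if last is None:
            return 0.0
        return max(0.0, self.min_interval - (self.clock() - last))

    def record(self, domain):
        """Note that a request to this domain has just been sent."""
        self._last_request[domain] = self.clock()


def should_back_off(status):
    """True for 429 (Too Many Requests) and any 5xx server error."""
    return status == 429 or 500 <= status <= 599


def backoff_delay(attempt, base=5.0, cap=300.0):
    """Exponential back-off: base, 2*base, 4*base, ... capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))
```

A crawl loop built on this sketch would sleep for `wait_time(domain)` before each fetch, call `record(domain)` once the request is sent, and, whenever `should_back_off(status)` is true, pause for `backoff_delay(attempt)` before retrying.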
Who sees it
- PCRA staff and admins — to curate and correct the directory.
- Research sponsors — a sponsor can see your entry only if they are matched to your therapeutic area for a specific study. Sponsors never see raw scraped snippets — they see curated fields we have reviewed.
- Processors: data lives in our UK/EU infrastructure (Neon, eu-west-2). No US transfer. See our Privacy Notice for the full list of processors.
Retention
We keep the latest scraped snapshot for the lifetime of the directory entry. Historical snapshots are discarded after 90 days. If we are asked to remove an entry, the live record is deleted immediately; the 90-day snapshot archive is purged within 30 days.
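For readers who prefer to see the retention rules as date arithmetic, here is a small Python sketch. The function names are hypothetical, and two details are assumptions for the example: snapshot age is measured from the capture timestamp, and the 30-day purge window runs from the moment the removal request is received. The latest snapshot of a live entry is not covered by the 90-day rule and so is not modelled here.

```python
from datetime import datetime, timedelta, timezone

SNAPSHOT_RETENTION = timedelta(days=90)    # historical snapshots
ARCHIVE_PURGE_WINDOW = timedelta(days=30)  # archive purge after a removal request


def snapshot_expired(captured_at, now):
    """True once a historical snapshot is more than 90 days old."""
    return now - captured_at > SNAPSHOT_RETENTION


def archive_purge_deadline(removal_requested_at):
    """Latest time by which archived snapshots must be purged after removal."""
    return removal_requested_at + ARCHIVE_PURGE_WINDOW


# Example: a removal request received on 5 May 2026 (UTC).
requested = datetime(2026, 5, 5, tzinfo=timezone.utc)
deadline = archive_purge_deadline(requested)  # 4 June 2026 (UTC)
```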
How to opt your practice out
Any one of the following is enough. You do not have to do more than one.
- Email us at info@pcralliance.uk with the subject line “Please stop scraping [practice name]”. We will remove the entry within 5 working days and add your domain to our block-list so we don't re-scrape it.
- Add a robots.txt rule on your site:
  User-agent: PCRAllianceResearchBot
  Disallow: /
  Our next crawl will honour this automatically.
- Tell us in writing at Primary Care Research Alliance, registered office (see Privacy Notice). We will treat postal letters the same as email.
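If you choose the robots.txt route and want to confirm the rule does what you expect, the following Python sketch checks it with the standard library's `urllib.robotparser`. It is offered as a convenience, on the assumption that our crawler interprets robots.txt in the same way as this parser; the helper name `allowed_by_robots` and the `practice.example` URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser

BOT_USER_AGENT = "PCRAllianceResearchBot/1.0"


def allowed_by_robots(robots_txt, url, user_agent=BOT_USER_AGENT):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The opt-out rule from the list above.
OPT_OUT_RULE = """\
User-agent: PCRAllianceResearchBot
Disallow: /
"""

# Our bot is blocked everywhere on the site ...
blocked = not allowed_by_robots(OPT_OUT_RULE, "https://practice.example/about")
# ... while other crawlers are unaffected by this rule.
others_ok = allowed_by_robots(
    OPT_OUT_RULE, "https://practice.example/about", "SomeOtherBot/2.0"
)
```

Because the rule names only our user-agent, it does not change how search engines or other crawlers treat your site.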
Your rights
Under UK GDPR you can ask us to:
- Tell you what we hold about your practice (right of access — Art 15).
- Correct anything we got wrong (rectification — Art 16).
- Delete the entry (erasure — Art 17).
- Stop processing while we look into a dispute (restriction — Art 18).
- Object to any further scraping of your practice (Art 21). If you object we will stop unless we can show an overriding legitimate reason — we have never done this in practice and don't expect to.
Complaints
Our Data Protection Officer is the first point of contact; email daphne@pcralliance.uk. If we haven't resolved your complaint within a reasonable time you can raise it with the Information Commissioner's Office.
Changes to this policy
If we change how the scraper works in a way that affects the data we collect or who sees it, we will update this page and bump the version number at the top. Significant changes will also be announced on our news page.