OSINT is Changing … How Data is Gathered by Cyber Criminals

Large-scale data breaches are reported on a near-monthly basis. The Experian and Facebook breaches are two well-known examples, each affecting millions of Americans. But how do hackers make off with that data?

“Hackers don’t break in, they log in.”

That maxim has been echoed by hackers and IT professionals alike for years. This year at DEFCON 30, it was widely discussed in the various hacker villages. Contrary to what popular media shows, hackers are not finding ingenious flaws in code to get into most systems they access. Instead, they are using phishing, social engineering, and previously leaked credentials to gain system access. Once inside the network of a target company, they employ various methods to gather as much data as possible before disappearing. These data sets are sometimes retained by the groups that take them. More often, they are auctioned off on dark web identity forums. In some cases, breached data is simply left on paste sites on the open internet.

With that in mind, let’s examine two very different data breaches that were each facilitated by publicly available data and OSINT techniques:

Parler

In 2020, Parler became embroiled in the political turmoil gripping the nation. As a result, it became the target of ideologically driven hackers and OSINT practitioners. Parler’s security posture was so poor that no login was needed to obtain massive amounts of user data. Parler had failed to implement one of the most basic safeguards that any company housing large amounts of user data should employ, namely protecting against insecure direct object references. Posts were numbered sequentially as they were received, so anyone who knew the address of one post could guess the next. This, paired with the lack of rate limiting and the fact that no login was required to view content, made Parler extraordinarily vulnerable to large-scale anonymous scraping.

This is exactly what happened in early 2021, when @donk_enby and her associates gathered and disseminated the entirety of Parler’s posting history. Video footage was of particular concern because Parler did not strip geolocation metadata from that content. This resulted in the exposure of more than 68,000 geolocatable points. These often correlated to posters’ home addresses, places of work, and even military bases (in at least two confirmed cases). More than a few people lost jobs or received threats at their homes as a result of this breach.

This breach amounted to ordinary social media intelligence gathering augmented by scrapers: OSINT at its most basic.
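The two weaknesses described above, sequentially numbered posts and the absence of rate limiting, can be illustrated from the defender's side. The following is a minimal sketch, not a description of how Parler or any other platform is actually built. The names `new_post_id` and `SlidingWindowLimiter` are hypothetical, and the thresholds are arbitrary assumptions chosen for illustration.

```python
import secrets
import time
from collections import defaultdict, deque
from typing import Optional


def new_post_id() -> str:
    """Return a random, unguessable identifier (hypothetical helper).

    Unlike a sequential counter, knowing one ID reveals nothing about the next.
    """
    return secrets.token_urlsafe(16)  # 16 random bytes, URL-safe text


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client id -> recent request times

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        # Discard requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: reject or delay the request
        hits.append(now)
        return True
```

Random identifiers alone do not replace access control, and a rate limit alone does not stop a patient scraper. Together with a login requirement, though, they make bulk anonymous collection far more expensive.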

Office of Personnel Management 

This hack began in 2014 and continued into 2015. The Office of Personnel Management (OPM) had failed to institute many security reforms that other parts of the government had already adopted. A critical gap was the lack of two-factor authentication. In government, a chip-enabled Common Access Card acts as a second factor for verifying identity, along with a username and password. OPM had not implemented this capability by 2014, when the first network intrusions occurred.
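To show what a second factor adds, here is a minimal sketch using a time-based one-time password (TOTP, as specified in RFC 6238). This is a swapped-in technique chosen because it fits in a few lines; it is not the smart-card mechanism used by the Common Access Card, and it is not a description of any OPM system. The function names `totp` and `verify_login` are hypothetical.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 time-based one-time code using HMAC-SHA1."""
    counter = unix_time // step  # number of time steps since the epoch
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset (RFC 4226)
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def verify_login(password_ok: bool, submitted_code: str,
                 secret: bytes, unix_time: int) -> bool:
    """Hypothetical check: a correct password alone is not sufficient."""
    expected = totp(secret, unix_time)
    return password_ok and hmac.compare_digest(submitted_code, expected)
```

Under this scheme, an attacker who phishes a password still fails the login without the current code, which is derived from a secret that never leaves the user's device.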

The government has not disclosed how the initial unauthorized access occurred. Most security experts suspect it involved a password obtained through phishing, social engineering, or some other means. We do know that in subsequent intrusions, the attackers used stolen passwords to gain access and install a backdoor in OPM’s network. These passwords came from two federal contractors who were routinely involved in security clearance paperwork. The backdoor allowed continued access even after the stolen credentials were revoked.

We will skip the technical details of what happened next. The resulting damage speaks for itself. In total, more than 22 million records were taken from OPM’s network. Millions of Security Clearance Questionnaires (commonly called SF-86) were taken. These documents contained occupational, psychological, medical, and relationship data about members of the United States government who have access to sensitive information. Information about potential addiction problems, criminal records, and other embarrassing details was also taken in this breach. Nearly 5.6 million sets of fingerprints for cleared members of the government and military were taken as well. Personnel working undercover could still be identified by a simple biometric scan at airport customs if they enter a country that has obtained these data sets.

In 2017, data from the OPM hack was used to breach the credit bureau Equifax, showing another problem that data breaches create: more data breaches!

Is there hope?

Data breaches are a constant threat to the identity of just about anyone. You do not need an online presence to become a victim: the Experian and OPM breaches exposed data about people with no online presence. That doesn’t mean that nothing can be done.

Breach data is one of the many signals that 443ID provides to our customers. This allows our customers to harden their systems against potential access by compromised credentials.
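As a general illustration of how breach data can serve as a signal, the sketch below checks whether a password appears in a set of previously leaked passwords. It is not a description of 443ID's product or methods. The function names and the sample data are hypothetical, and a real deployment would load hashes from an actual breach corpus rather than a short in-memory list.

```python
import hashlib
from typing import Dict, Iterable, Set


def sha1_hex(password: str) -> str:
    """Return the uppercase SHA-1 hex digest of a password."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def build_breach_index(leaked_passwords: Iterable[str]) -> Dict[str, Set[str]]:
    """Index leaked password hashes by their first five hex characters."""
    index: Dict[str, Set[str]] = {}
    for password in leaked_passwords:
        digest = sha1_hex(password)
        index.setdefault(digest[:5], set()).add(digest[5:])
    return index


def is_compromised(password: str, index: Dict[str, Set[str]]) -> bool:
    """Return True if the password's hash appears in the breach index."""
    digest = sha1_hex(password)
    return digest[5:] in index.get(digest[:5], set())


# Hypothetical sample data, for illustration only.
breach_index = build_breach_index(["password123", "letmein"])
print(is_compromised("letmein", breach_index))
print(is_compromised("correct horse battery staple", breach_index))
```

Splitting the hash into a short prefix and a suffix mirrors the lookup style used by some public breach-checking services, where only the prefix needs to be sent over the network. A system that receives a positive result can then require a password change or an additional verification step.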

In a future “OSINT is Changing” article, we will examine a specific data breach and discuss how people can determine whether they are affected, as well as countermeasures they can use to protect themselves.


  1. https://www.wired.com/story/parler-hack-data-public-posts-images-video/
  2. https://www.csoonline.com/article/3318238/the-opm-hack-explained-bad-security-practices-meet-chinas-captain-america.html
  3. https://abcnews.go.com/Politics/opm-hack-deeper-publicly-acknowledged-undetected-year-sources/story?id=31689059
  4. https://www.nytimes.com/2015/09/24/world/asia/hackers-took-fingerprints-of-5-6-million-us-workers-government-says.html
Steven Sheffield is a career intelligence professional with a background in special operations mission support, targeting, and OSINT collection. During his 20+ years in the industry, he has worked as both an analyst and a user experience researcher. This experience uniquely positions him to manage intelligence shops, design new analytic processes incorporating the latest technologies, and write Agile user stories for developers building next-generation OSINT products. He is currently a PhD candidate at Clemson University in Rhetorics, Communication, and Information Design.
