Timehop Security Incident, July 4th, 2018
UPDATED ON JULY 11TH, 2018 10:12 (See main post here)
The following is intended to provide technical details for those interested in the specifics of the information security incident Timehop has experienced, to be transparent about what happened, and to correct some earlier inaccuracies. We are still withholding some highly specific details about an incident that remains the subject of ongoing investigations.
On December 19, 2017, an authorized administrative user's credentials were used by an unauthorized user to log into our Cloud Computing Provider. This unauthorized user created a new administrative user account and began conducting reconnaissance activities within our Cloud Computing Environment. Over the next two days, and on one day in March 2018, the unauthorized user logged in again and continued to conduct reconnaissance. However, no Personally Identifiable Information was available in the environment during those visits, so the attackers took no data.
In April 2018, Timehop employees migrated a database containing personally identifiable information into the environment. The attacker discovered it when they logged in on June 22, 2018. The unauthorized user then logged in again on July 4, 2018, when the database containing PII was stolen.
Here is the timeline of what happened. We have provided a great deal of detail here but continue to withhold certain specifics for security reasons. We are providing information on the Internet Protocol addresses used by the attackers to law enforcement and the information security community.
Note that, as we have stated, an entire database was taken, and that database included access keys to social media sites. Those keys were in a different table of the database, one which contained no PII and which we are therefore not disclosing. The remaining tables within the database contained data necessary for daily operations at Timehop and its support functions.
- [Employee]'s credentials were used to log into our Cloud Computing Environment from an IP that resolves to the Netherlands; we will refer to this unauthorized user as [The Unauthorized User].
- [The Unauthorized User] creates an API access key on [Employee]'s account and a new user '[account_name]', attaches admin access to that account, and creates an API access key on that account as well.
- [The Unauthorized User] logs into [account_name] using the API and starts scraping the list of tables, accounts, roles, and alarms - in short, [The Unauthorized User] was conducting cyber reconnaissance. There was no Personally Identifiable Information available in the environment at this time.
- [The Unauthorized User] logs in again and conducts more reconnaissance.
- [The Unauthorized User] logs in again and does even more reconnaissance.
- [Employee] legitimately creates the Users table in a database in preparation for migration, but it is not populated with data. To be clear, at this time the Users table contained no data, so no personally identifiable information was present.
- [The Unauthorized User] logs in and does even more reconnaissance. The Users table in the database still contains no user data.
- [Employee] legitimately migrates the Users table data into the database, introducing the personally identifiable information that would become the target of the upcoming attack. Until this date, 4/4/2018, no personally identifiable information was present in the environment accessible by the attackers; while some may have existed in an ephemeral cache environment, access logs examined to date indicate the attacker never attempted to view the data in the cache.
- [The Unauthorized User] logs in again and does some final reconnaissance of the Users database table, discovering the user data (including personally identifiable information) it now contains.
- 2:04 PM: [The Unauthorized User] logs in and begins to restore an existing snapshot of the database containing the Users table into a cluster they created, called "Reusers".
- [The Unauthorized User] continually checks alarms and monitors while the restore process is progressing.
- The restoration of the snapshot takes around 30 minutes, after which [The Unauthorized User] spins down the restoration process.
- 2:43 PM: [The Unauthorized User] resets the password to the production Users database.
- 2:43 PM: Internal alerts report the ‘Reusers’ database is available.
- 2:44 PM: [The Unauthorized User] initiates a deletion of the ‘Reusers’ cluster.
- 2:49 PM: Internal alerts report the ‘Reusers’ cluster is closing down.
- 2:50–4:00 PM: A massive spike in DB reads on the Users production cluster is reported by internal alerting tools. Timehop application users report black screens.
- 4:04 PM: Internal alerts report that the service is down. A Timehop engineer investigates and tries to restart the database.
- 4:13 PM: The Timehop engineer discovers that the password has been changed.
- 4:16 PM: The Timehop engineer discovers that the database, while still password protected, is not behind a firewall and can be accessed by anyone with the password.
- 4:23 PM: The Timehop engineer resets the password to the database; services start to come back up.
- 6:09 AM: [The Unauthorized User] logs in using the API access key on [Employee]'s account and lists Cloud Computing Environment users, and logs out.
- 12:10 PM: Timehop engineers begin an investigation into the event.
- 12:30 PM: Timehop engineers reviewing logs recognize suspicious patterns and conclude that we have been attacked. An incident is declared.
- 1:25 PM: Other Timehop engineers examining logs and artifacts confirm the incident and begin removing access to [The Unauthorized User]'s account and the [Employee]'s account.
- 1:35 PM: Engineers begin enforcing an MFA policy on all access to all accounts.
- 2:02 PM: Cloud Computing Environment is considered secure.
- 2:47 PM: COO notifies federal law enforcement.
- 2:55 PM: CEO contacts a cyber incident response company.
- 3:11 PM: CEO notifies the Board of Directors.
- 3:22 PM: CEO and incident responder begin discussing engagement.
- 4:45 PM: Incident responder arrives on scene and begins the investigation.
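Patterns like those in the timeline above, such as a new administrative user, fresh API access keys, and enumeration calls from an unfamiliar address, are exactly what audit-log alerting can surface. The sketch below is a minimal, hypothetical illustration; the record shape and event names loosely follow AWS CloudTrail conventions, but the post never names the provider, so treat every field name and event name here as an assumption rather than Timehop's actual tooling.

```python
# Hypothetical sketch: flag high-risk identity events in cloud audit logs.
# Field names ("eventName", "sourceIPAddress") are assumptions modeled on
# CloudTrail-style records, not the provider's actual schema.

HIGH_RISK_ACTIONS = {"CreateUser", "CreateAccessKey", "AttachUserPolicy"}

def flag_suspicious_events(events, trusted_prefixes=("10.",)):
    """Return high-risk actions originating outside trusted IP ranges."""
    flagged = []
    for ev in events:
        if ev["eventName"] not in HIGH_RISK_ACTIONS:
            continue  # routine read-only calls are ignored here
        if not ev["sourceIPAddress"].startswith(trusted_prefixes):
            flagged.append(ev)
    return flagged

# Illustrative records: one admin-user creation from an external address,
# one benign reconnaissance-style read, one key creation from inside.
sample = [
    {"eventName": "CreateUser", "sourceIPAddress": "91.0.0.5"},
    {"eventName": "DescribeAlarms", "sourceIPAddress": "91.0.0.5"},
    {"eventName": "CreateAccessKey", "sourceIPAddress": "10.0.0.8"},
]

print([e["eventName"] for e in flag_suspicious_events(sample)])  # ['CreateUser']
```

Real detection would key off many more signals (time of day, principal history, geolocation), but even this crude filter would have fired on the December 19 account creation.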
As the timeline above shows, on July 4, 2018, the attacker(s) conducted activities including an attack against the production database and a transfer of data. At 2:43 PM US Eastern Time the attacker took a specific action that triggered an alarm, and Timehop engineers began to investigate. By 4:23 PM, Timehop engineers had begun to implement security measures to restore services; however, they still believed they were dealing with a maintenance issue. They did not immediately suspect a security incident for two reasons that, in retrospect, are learning moments. First, because it was a holiday and no engineers were in the office, the responding engineer considered it likely that another engineer had been doing maintenance and changed the password. Second, password anomalies of a similar nature had been observed in past outages. He decided that the event would be examined the next day, when engineers returned to the office.
The following morning, engineers discussed the issues and began to explore the previous day’s incident. Around noon, the engineers began the investigation that would lead to a security incident response and the locking down of the environment.
Incident Response & Communication Plans
Once we recognized that there had been a data security incident, Timehop's CEO informed federal law enforcement officials; contacted the Board of Directors and company technical advisors; and retained the services of a cyber security incident response company, a cyber security threat intelligence company, and a crisis communications company.
The formal incident response, which began July 5th, started with an examination of the voluminous logs of all activities. This takes time and is iterative, even with data visualization tools. This preliminary understanding of the timeline and the attackers' activities made clear that we needed to immediately conduct a user audit and permissions inventory; change all passwords and keys; add multifactor authentication to all accounts that did not already have it, across all cloud-based services (not just our Cloud Computing Provider); revoke inappropriate permissions; increase alerting and monitoring; and perform various other technical tasks related to authentication and access management and the introduction of more pervasive encryption throughout our environment. We immediately began deauthorizing compromised access tokens and, as we describe below, worked with our partners to determine whether any of the keys had been used.
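The remediation described above, revoking access for compromised principals and rotating everything else, amounts to a triage pass over a credential inventory. The sketch below is a minimal illustration of that idea, not Timehop's actual process: the inventory shape, user names, and 90-day rotation cutoff are all assumptions chosen for the example.

```python
# Hypothetical key-triage sketch: revoke keys owned by compromised
# principals immediately; rotate any remaining keys past a maximum age.
from datetime import date

def triage_keys(inventory, compromised_users, today, max_age_days=90):
    """Split an access-key inventory into (revoke_now, rotate_soon) lists."""
    revoke, rotate = [], []
    for key in inventory:
        if key["user"] in compromised_users:
            revoke.append(key["key_id"])          # compromised: kill immediately
        elif (today - key["created"]).days > max_age_days:
            rotate.append(key["key_id"])          # stale: rotate on schedule
    return revoke, rotate

# Illustrative inventory; names and dates are invented for the example.
inventory = [
    {"key_id": "KEY-A", "user": "employee",   "created": date(2018, 3, 1)},
    {"key_id": "KEY-B", "user": "deploy-bot", "created": date(2017, 11, 2)},
    {"key_id": "KEY-C", "user": "deploy-bot", "created": date(2018, 6, 20)},
]
revoke, rotate = triage_keys(inventory, {"employee"}, today=date(2018, 7, 5))
print(revoke, rotate)  # ['KEY-A'] ['KEY-B']
```

In an incident like this one, where the compromised principal's credentials had been abused for months, the conservative choice is to treat the entire inventory as compromised and rotate everything, which is what the post describes.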
Assessing the Damage
An assessment was conducted of the data that could have been exposed, and determinations were made as to whether this data was actually exfiltrated. In contacting our social media provider partners, our goal was to determine whether any of the keys we had determined were taken had been used in any way, and to deauthorize all keys. The logistics around these critical activities were dictated by the process of conducting these queries in collaboration with our partners.
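The central question in that assessment, whether any stolen keys were actually used after the theft, reduces to intersecting the stolen identifiers with post-theft entries in usage logs. A minimal sketch follows, assuming ISO-8601 timestamps (which compare correctly as strings) and hypothetical field names; real partner-provided logs would of course look different.

```python
# Hypothetical sketch of the key-usage check: which stolen tokens appear
# in usage logs after the theft time? Field names are assumptions, not a
# real partner API.
def keys_used_after(stolen_keys, usage_log, stolen_at):
    """Return sorted stolen key IDs seen in the log after stolen_at."""
    return sorted({rec["key_id"] for rec in usage_log
                   if rec["key_id"] in stolen_keys and rec["timestamp"] > stolen_at})

# Illustrative log: one pre-theft use of a stolen token, one post-theft
# use of a token that was never stolen.
usage_log = [
    {"key_id": "tok-1", "timestamp": "2018-07-03T10:00:00Z"},
    {"key_id": "tok-2", "timestamp": "2018-07-05T09:00:00Z"},
]
print(keys_used_after({"tok-1", "tok-3"}, usage_log, "2018-07-04T14:04:00Z"))  # []
```

An empty result is the outcome the post reports: none of the taken keys showed post-theft usage, which is what allowed the blanket deauthorization to proceed without evidence of actual misuse.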
Law Enforcement and the Investigation
Simultaneously, company executives were communicating with local and federal law enforcement officials and with information security professionals at our partners and the retained information security firms to help us understand the impact of this incident on our users.
How We Approached This
The toughest decisions were those that disrupted services - such as access to Google Photos and Dropbox Photos. While we were confident that the access keys to those services had not been used, we felt that potential exposure of that content urgently justified a service interruption to ensure that attackers could not, for example, view personal photos. Through conversations with the information security, engineering, and communications staff at these providers, we were able to deactivate the keys and confirm that no photos had been compromised.
The CEO and COO agreed that the communications strategy should be built on two main values: first, that users must be told as soon as doing so would not threaten security or compromise the investigation; and second, that our communications should be as open as possible without threatening security or compromising the investigation. At no time was withholding notification from users even considered; the debate was only over how much to disclose and which details had to be withheld in order to protect users. Although we found that no tokens had been used, we had prepared for a worst-case scenario in which they were misused.
After the decisions about the authorization tokens, additional customer support staff were engaged to ensure that user questions could be answered quickly. The communications strategy would be supported by programs to provide information for general users, technical users, and the media, as well as a glossary of terms, a list of answers to Frequently Asked Questions, and updated blog and web posts. In planning for the possibility of a more significant breach, journalists were consulted and provided information under embargo in order to determine the most effective ways to communicate what had happened while neither causing panic nor resorting to bland euphemism. In the end, since there was no evidence that social media data had been compromised, this proved unnecessary.
A significant amount of the time it took to respond publicly was spent contacting a large number of partners and sharing information with them to support a complex technical investigation and coordinate the incident response. We thank all our partners for responding so aggressively, calling in staffers from vacation to help respond to this incident over a holiday week.