Timehop + Bishop Fox

We had the opportunity to work with Bishop Fox on this report about security and small errors.

The takeaway from Timehop is that when small security errors do lead to the worst-case scenario, a breach, be transparent and show your customers that you are working quickly to fix the issue. The one good thing about falling prey to one of these errors is that they are usually easily fixed. There are several other lessons from Timehop’s breach and their response, but ideally you will learn from them as well as the other examples discussed...
— Alex DeFreese, Bishop Fox

Letter of Recommendation: Timehop

Thanks to The New York Times Magazine and Daniel Kolitz for putting this into words.

Timehop’s “memories” are really more like memory aids — maps to storage spaces on the outskirts of daily consciousness, haphazardly crammed with old biology partners, forgotten nicknames, the quality of light in an old basement bedroom. Often the maps are illegible: I could spend the rest of my life staring at my post of Dec. 20, 2009 (which reads, in its entirety, “tummy sticks!”), and come no closer to knowing its secrets. But when the maps work, the results can hit as hard as a song, a scent, a familiar face in a crowd.
— Daniel Kolitz, The New York Times Magazine

Check out the full piece in the July 1, 2018 print edition of The New York Times Magazine.


What to know about turning on mobile ads

Here's how to create successful ads that enhance the in-app experience

Online advertisements have been called part of a broken system. They’re sometimes described as necessary evils. They’re often avoided altogether in favor of subscription models or some other ingenious way for companies to pay the bills.

But they’re never taken lightly, meaning they’re always at least contemplated with one question in mind: how will they affect my user, reader or customer?

If determining when and how to turn on advertising on the web seems tough, imagine making the same decision for a mobile app with limited screen space, where slight modifications risk alienating your most loyal, core users, the ones who are most engaged and love the app the way it is. They’ll revolt at even the smallest change, let alone the introduction of advertisements that are tolerable and utilitarian at best, downright infuriating at worst.

App developers have to create an ad experience that integrates with their app’s overall user experience. The questions of when and how to incorporate advertising should be top of mind from the very beginning, with an understanding of the ways ads can be incorporated without taking away from the experience, but rather enhancing it.

If done well, advertising on mobile can be very lucrative. In-app mobile ad spend will reach $45.3 billion, up $11 billion from last year, and apps make up 80 percent of all U.S. media dollars spent on mobile, according to eMarketer.

Be intentional and communicate

App developers shouldn’t be afraid to incorporate ads because they think their users will leave. Many users, after all, understand why ads exist. About a third of respondents surveyed by HubSpot said “I’m fine with the current situation. I see ads to support websites.”

In fact, users now expect free apps to have an advertising component, said George Thabit, Senior Manager of Platform Sales at MoPub.

“If you think about the maturation of the app market -- it’s been over a decade since the iPhone was released -- when users come across an app that’s free, there’s probably an expectation that there’s going to be some value exchange in the future, whether it’s implicit or explicit when downloading it,” he said.

This doesn’t mean any ad is acceptable. In the HubSpot survey, 68 percent said they’re “fine with seeing ads but only if they are not annoying.” Annoying in this case means ads that are disruptive and affect download speeds.

For that reason, app developers have to be careful ads fit well into the app experience.

“Being intentional about the implementation of the ads is important. You need to have a measured approach and be mindful of the placement and format as well as the timing,” said Thabit. “Also, making sure that you QA your build prior to launch is super important because the biggest direct relation we see with negative comments of usage is if the introduction of an ad crashes the app, or takes away from the user experience.”

To this end, it’s essential to work with users and to communicate to them how and why ads are being turned on, said Ali Jafari, VP of Business Development at Nextdoor, a social network focused around communities. His company actually uses its own users to help the broader community understand what’s happening on the platform.

“Anytime we release a new product or a feature, we work with our neighborhood leads,” he said. “Every neighborhood has a lead, it’s sometimes the founding member; other times someone who’s raised their hand and said they want to be a lead. We work with them to give them context for why we’re releasing a feature, like advertising.”

If there’s pushback in the neighborhood, the leads explain that the ads are needed to support the free service to the community. There’s a conversation that takes place that brings the community in as though they’re part of the decision-making process. If they feel left out of the conversation, or neglected, they may jump ship, Jafari explained.

Think about the product early on

If you talk to app publishers, the conventional wisdom is that the audience should be large enough before ads are even worth turning on. There’s logic to this. Users don’t like ads and might be turned off if they see them right away (as I said before, they know ads are necessary, but they also don’t want them to be annoying). And apps with a small audience base typically won’t have access to the highest-quality ads.

To be sure, by making an ad-free experience the norm, app developers are setting themselves up for significant resistance.

“My observation is that it gets harder and harder, once you’ve built a large audience that is deeply engaged in an ad-free environment, when you introduce advertising or require another mechanic, like in-app purchases,” said MoPub’s Thabit. “People react negatively when the perceived value exchange changes and ads are suddenly introduced; they become a barrier instead of a natural part of the app experience. Thinking through prior to launch how you’re going to generate revenue is paramount because when you have your app launched with ads, the expectations are set and your users will be familiar with the ad options, whether they’re in-app purchases or ads that can be opted out by paying a one-time fee.”

Moreover, apps that don’t integrate ads from the start may require a costly overhaul to fit them in later, an expensive proposition for most app developers.

“I don’t think publishers could afford to really do a 100 percent UI change,” said Sameer Sondhi, VP & GM of Business Development for NA and EU at InMobi. “There are very, very rare publishers who have taken such bold steps, and it’s basically make it big or go home. There are companies who have invested so much there’s so much risk that you can’t afford to take it.”

Go native

Another major factor to consider when integrating advertising is simply the creative.

There are numerous formats to choose from - video, banner, native. In 2018, mobile video ad spending is expected to grow 49 percent to nearly $18 billion. But not every ad type works for every type of app. In some cases video is going to distract from the experience; in others banner ads simply don’t fit. It’s imperative to understand how your specific app can best incorporate them.

“One of the most important factors to consider is the format,” said InMobi’s Sondhi. “Do you have the right mix of video? Do you have the right mix of full screen static interstitials? Do you have sticky banners? Where we really need to make a line here is that we don’t want to overdo ads.”

For Nextdoor, using native advertisements made the most sense. They can be broken down into two categories. The first, and likely the most well known form, is the type described by Fred Wilson in 2011. They’re basically ads that look like posts on a social platform.

“I’ve never been a fan of advertising that’s intrusive or gets in the way of what the user’s trying to do. With mobile, the bar is even higher than it is on the web because the screen size is so much smaller. There’s nothing worse than scrolling through a feed, or trying to read an article, and being redirected to a different place where you have no idea what happened, and you have no idea what information is being passed on. It does damage to the platform and the advertiser as well,” Nextdoor’s Jafari said.

“I don’t think we gave consideration to much beyond the native format. We’ve always wanted to make sure that the ads didn’t get in the way of what our members wanted to do on Nextdoor, which is to create a community and interact with members. We also wanted the ads to feel like content, delivering a message that’s value-add and natural to the kinds of conversations that may occur on Nextdoor.”

The other type of native is one that uses programmatic tools to sell ad space through machine learning. These are ads that are programmatically delivered as assets, such as a photo, a headline, body copy, a link or a logo. The app maker creates an ad template that looks roughly like its regular posts.

Programmatic native is still nascent, said InMobi’s Sondhi, and hasn’t quite taken off in the United States yet.

“Native is not as strong in the United States, as compared to other countries. There aren’t a lot of demand-side platforms or bidders who are ready to buy native. Native is so custom. The payloads can be stitched by a customer, the click-to-action has to be the way they want, the templates have to be different,” he said.

For that reason, he recommends taking native slowly.

“For native to be successful, there should be a diversity of demand and the right mix of performance-based ads to choose from,” said Sondhi. “There aren’t a lot of demand sources or platforms that are ready yet at scale. So we do advise a lot of our publishers who have the intent to go native to draw it out, slowly. Turn it 20 percent, then 30 percent. That should be the cadence for rolling out native.”

Conclusion

Implementing an advertising strategy is tricky. It requires forethought and patience, and the ability to read your audience and fully understand how they use the app.

Few get it right the first time. Remember that even Facebook stumbled out of the gate when it went public in 2012 because investors weren’t convinced that it could make money from its ad product. In 2017, more than one in five digital ad dollars in the U.S. went to Facebook.

It doesn’t happen overnight but as long as you’re considerate and mindful, your ad strategy is more likely to be a success.


Timehop’s journey toward conquering mobile ads

Being ahead of the curve with “in-app header bidding” led to a 1200% jump in daily revenue

For years, we’ve known that mobile usage is greater than desktop; it surpassed the web back in 2014, to be exact. Yet mobile advertising options have been less than ideal, on mobile web or in app. This has left my company Timehop with no choice but to go on an unexpected and unconventional journey to build our own solution. In the end, it wasn’t only worthwhile, it was a fortuitous decision -- one of the best we’ve made so far in our young history.

To appreciate our story, you have to understand the two ways to make money via mobile advertising: through a browser or through an app. Mobile web is a relatively robust ecosystem, thanks to being similar in technology to desktop web. Yet there are several drawbacks to this approach in mobile. Ad blockers limit your audience. Customers may prefer to consume your content in an app. And you lose all of the user experience benefits of building an app.

On top of all of this, advertisers prefer the safety, accountability and efficacy of in-app advertising. In-app mobile ad spend comprises about 80 percent of all US media dollars spent on mobile, according to eMarketer. That spend is estimated to have hit $45.3 billion last year, up from $11 billion in 2016. At Timehop, in our early years, we spent considerable effort migrating our first few million users from an email-based service to one consumed in an app. These larger advertising trends were part of our motivation.

The problems and promise of in-app advertising

While consumers spend more than two hours on apps each day versus 26 minutes viewing the web on mobile, those in-app audiences are largely concentrated among the top five players, such as Instagram and Snap. These large companies, with hundreds or thousands of employees and heavy demand from advertisers, are in a position to build their own ad-serving technology in-house and dictate to the market - which is yearning for their inventory - how to buy their ads.

For everyone else - from the smallest app to large, top publishers - the solution is more complex. The default approach is to replicate what they’ve done on the web: an in-house staff directly sells the inventory, and third-party providers supply the technology to sell whatever remains.

Yet because of the challenges of in-app advertising, and its differences from the web, third-party solutions leave much to be desired. The technical solutions offered by third parties are more limited than on the web, and those that exist often make a middling attempt at replicating desktop web technologies, frequently at the expense of the improved user experience an app offers.

In-app advertising solutions also often lag behind their mobile web counterparts. Take header bidding. It’s all the rage on desktop web and in-browser mobile. Header bidding is a technique where the ad auction runs in the page’s HTML header rather than as the page loads: bids are solicited from multiple ad exchanges simultaneously, before the page is rendered. This is the opposite of the waterfall approach, where each partner is contacted individually, one after another. This new, popular approach to web advertising results in improved revenue for the publisher.

Header bidding obviously makes the selling of ad space faster and more efficient. Yet in-app “header bidding” is a nightmare (never mind that there are no HTML headers in mobile apps; the name has stuck for simultaneous auctions). Says Digiday, in its rundown of the situation, “Like much of ad tech, header bidding was built to solve a desktop challenge. But mobile is eating media.”

Third-party technical solutions exist, but they are less than ideal. Many of them rely on software development kits, or SDKs, which means incorporating a big chunk of code into the app. And you don’t just have to implement the SDK of the company handling the auction; imagine working with a number of ad partners and integrating an SDK for each one. At Timehop, we have more than 10 different ad partners; if we bundled each SDK, we’d be a 600 megabyte app. We’d be bloated and slow, and it would hurt the user experience.

There are other reasons app developers hate SDKs. There’s a lack of control, and they’re extremely rigid. If I wanted to change something to make it look right for my users, I’d have to request that change from the ad partner.

The paradox for us and for many publishers is clear: in-app theoretically provides greater advertising revenue, yet the technical solutions aren’t as effective yet. For many publishers, the paradox is immaterial since most of their readers rely on mobile web. Think about it: how many news publication apps have you downloaded? But for Timehop, where millions of users are interacting with our app every day, the problem is acute.

Which led us to start this journey.  

Enter Nimbus and a 12x revenue jump

This all explains why we created Nimbus, our header-bidding solution, which rids us of those app-bloating SDKs and lets us stay flexible. Our company consists of 15 people, and it was a sizable commitment of resources to go down this path. But having tried several of the “best-in-class” in-app advertising solutions, we felt we had no choice.

Nimbus is our ad server, which brings the “header bidding” process to in-app mobile. Nimbus holds a simultaneous auction for 10 (and counting) major ad networks at the beginning of every user session, delivering the highest-paid ad to the user. All without resorting to implementing innumerable SDKs. Prior to Nimbus, we managed clunky waterfalls - passing our user from one ad provider to the next, waiting for someone to bid. This resulted in lower income, and a degraded user experience. On top of that, we had a bloated app. At one point, we had four ad partner SDKs in our app. Horrendous.
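To make the “simultaneous auction” idea concrete, here is a minimal sketch of what such a fan-out auction can look like in Go, Timehop’s primary language. It is purely illustrative: the Bidder interface, the field names, and the timeout are assumptions for the example, not Nimbus’s actual API.

```go
// Illustrative sketch of a simultaneous in-app auction: fan the bid request
// out to every ad partner at once, wait up to a deadline, and keep the
// highest-paying bid. Types and names here are hypothetical.
package auction

import (
	"context"
	"time"
)

type Bid struct {
	Partner string
	CPM     float64 // price per thousand impressions
	Markup  string  // creative to render if this bid wins
}

type Bidder interface {
	RequestBid(ctx context.Context) (Bid, error)
}

// RunAuction asks every partner in parallel and returns the highest bid
// received before the deadline. Late or failed bidders are simply ignored.
func RunAuction(bidders []Bidder, timeout time.Duration) (best Bid, ok bool) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	results := make(chan Bid, len(bidders))
	for _, b := range bidders {
		go func(b Bidder) {
			if bid, err := b.RequestBid(ctx); err == nil {
				results <- bid
			}
		}(b)
	}

	for i := 0; i < len(bidders); i++ {
		select {
		case bid := <-results:
			if !ok || bid.CPM > best.CPM {
				best, ok = bid, true
			}
		case <-ctx.Done():
			return best, ok // deadline reached: serve the best bid seen so far
		}
	}
	return best, ok
}
```

The contrast with a waterfall lives in that loop of goroutines: every partner gets the request at the same time, and the app never waits longer than the deadline for a winner.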

No more. We started Nimbus in mid-November 2017, and had a beta launch by the end of that month, just in time for the holidays. We started with video, which generates higher CPMs. And, if I may be so bold: We killed it. Daily revenue grew by 12x during November and December.

After the new year, we started implementing static images, as well as viewability scores and anti-fraud features. Even though video inventory dropped, which put some pressure on revenue, sales are still a healthy 400 percent above where we started. Moreover, it’s only been three months since launch and already the product has paid for itself, meaning we’ve already made more revenue than the cost of development. Post holidays, we’re now at around 7x our pre-Nimbus revenue, with significant room to grow.

Building your own in-house solution isn’t easy. We had our own challenges, which included having a small team. Fortunately, we had the right mix of expertise in mobile development and programmatic advertising on desktop, and that combined knowledge enabled us to build Nimbus. This kind of talent isn’t easy to find. There are probably only a handful of people who can do this in New York, and anyone with expertise in mobile programmatic is likely going to work for Facebook or Google. We were lucky to hire a programmatic partnerships exec and two engineers from the same programmatic company, which was going out of business. All this to say that it’s hard to replicate what we built.

As for the audience size, there’s little value in spending the money to build a solution if you don’t have viewers to see the ads in the first place.

Publishers’ backs are against the wall

Of course, none of this would have been possible had we not had a sizable mobile audience to begin with. We think of ourselves as “too big to be small and too small to be big.” Millions of daily actives is a sizable number, so long as you’re not comparing yourselves to Facebook or Snap.

For large publishers, this is a conundrum. One study conducted by Nielsen and the Knight Foundation showed a significant imbalance between news organizations’ readership on their websites and on their dedicated apps. What’s clear: consumers don’t like downloading news apps.

According to the report, “mobile users who access news through apps spend more time reading the content, but the overall audience for apps is small.”  

 

For large publishers, their backs are against the wall. Audiences are moving to mobile. But on mobile, their audiences are in the browser, while the real ad money is in apps. And apps are dominated by Facebook, Twitter, Snap and their ilk. Says Digiday, “Apps theoretically present a huge opportunity for publishers since eMarketer estimates that 86 percent of the time users spend on mobile is spent in apps. But publishers have struggled to monetize their content in apps, and many of the most popular apps simply do not belong to publishers. A spokesperson for App Annie said that only two (ESPN and CNN) of the top 200 most-downloaded apps last month belonged to publishers.”

Therein lies the problem for these publishers. Let’s say I’m Coca-Cola and I call a news organization with a small in-app audience and say, “I’ve got a new ad campaign for Diet Coke with Lime, and the digital side of it is $20 million. I’m going to send it out in $2 million chunks spread over in-app advertising, branded content, maybe an event, or maybe a big show. Please send me a proposal.” The news organization would make its proposal to the brand, but without a large in-app audience, it would essentially be missing one segment the brand wants to target.

This is what sets us apart from many other apps out there. We have an in-app audience, and we have beautiful ads that are full screen and highly, highly viewable (we have excellent MOAT scores!).

Looking forward

At Timehop, now both pieces are in place in-app: audience and monetization. It’s taken years since we first migrated over from an email list, but we’re there. We have a sizable audience of daily active users - several million. And we can now effectively monetize them. We can do it quickly. We can do it with a polished user experience fully integrated into our product. We’re even ready to accept full screen vertical video ads - some brands have re-purposed their Snap ads for Timehop. We would love to see more of that.

We’ve also begun setting up Private Marketplaces [PMPs] with brands and trade desks, giving them priority access to advertise with our users, helping us maintain quality advertising for premium brands, and sparing us from the more shady corners of the programmatic world.

Not only that, having both pieces in place has allowed us to control our own destiny. Staying a small team has helped, to be sure. And now with both pieces in place we can look forward to building the many, many other product innovations we have in the pipeline.

We’re excited for what comes next.

 


How apps broaden utility while staying simple

simplicity.png

Stay true to a mission and take away the burden of complexity

Close your eyes, and imagine a time in the not-so-distant past. A time when technology finally made the leap from research facility to the living room. When the future vision of what daily life would become started to include a lot more robots. That time was called the 1970s, and it was magical. It introduced a whole new concept in helping humans connect with technology without needing to be a software engineer. It was called being “user-friendly.”

The idea still rings true half a century later as technology (even pre-robot revolution) increasingly pervades every aspect of our daily lives: talking virtual assistants, smartphones, VR headsets, tablets, and, of course, mobile apps.

There are around five million apps available for download on any given day, and in one quarter alone more than 17 billion of them were downloaded. That’s roughly 5.7 billion a month, 188 million each day, 131,000 each minute. It’s a crowded world, and people have very quickly developed a sense of what is worth their time: 80 percent of all downloaded apps are deleted after a single use. It’s like dating, am I right?

Vying for a piece of someone’s valuable time and space (quite literally the precious space on their phone not taken up by pictures of food), you have to fight for that quick “A ha!” moment. For Timehop, that’s the moment we show you your first photo from this day last year. It’s surprising. It’s delightful. It makes sense, instantly.

Our main goal has been to get users to that moment as quickly and painlessly as possible. It’s the apps that really know their “A ha!” moment and focus on it that rise above the noise. Apps where you just know what to do, like Lyft, Uber, Instagram, and Google Maps. What’s common across these apps is that certain je ne sais quoi. Except we do know what. It’s being “user-friendly.”

And such is the big challenge for companies trying to grow or broaden their functionality while staying elegant, simple and frictionless. It’s a constant concern staying true to a clear overall mission while tackling real complexities of a problem for your user. Oh, and it should feel personal. And delightful. And innovative. Easy enough, right?

Staying on the golden path and avoiding impending doom 

It’s incredibly easy to be distracted by shiny objects and new opportunities. When you’re working towards growth, every idea is a siren song, pulling your ship towards some rocky outcrop and impending doom. That may seem dramatic, but many underestimate how important it is for an app, or company as a whole, to have a clear vision of what they are solving.

If you don’t know that yourself, how can you be sure you’re communicating it well to your users? (This sounds a lot like RuPaul’s advice for a company mission.) Snapchat, for example, has talked ad nauseam about leading with a live camera feed, steering users to always be creating and sharing.

Raj Kapoor, Chief Strategy Officer at Lyft, refers to this lead-to-action flow as “the golden path.” It’s where the app leads its users and, no matter what features or functionalities are added, this path never changes.

“We never want to interfere with the simplicity of the main process,” said Kapoor, adding that Lyft’s golden path is getting customers from point A to point B, while providing the most delightful experience throughout. “But when we do create new features, the final implementation is highly-debatable as the additional functionality shouldn’t ruin the golden path.”

Sounds easy, but consider that even the smallest change to the initial experience can reverberate throughout the app. Even the smallest of new features threatens to disrupt the balance of added value with your golden path.

“The challenge in all of this is that every action has a reaction. It’s like the Heisenberg uncertainty principle, whereby just observing an atom may cause its location to move because the energy of the observation moves it,” Kapoor illustrated. “The same thing applies here: by changing a feature, you’re impacting something else. People love to say that you can isolate it but if you really truly look at all the data, there’s nothing you can do in isolation.”

Of course, as always, there’s a downside. Staying too focused makes it easy to grow accustomed to, and accepting of, non-change, reinforcing the status quo rather than encouraging out-of-the-box thinking, said Jack Chou, Head of Product at Affirm, a financial services app for purchasing products.

“By holding ourselves to such a high bar in terms of not adding features that clutter and detract from the existing experience, building something new can require a bit more inertia,” he said. “Another downside I can see is that you could become narrow-minded and myopic rather than open to other opportunities and other ways people could be using the app.”

Apps have to find a middle ground, a way where they are able to open up new experiences for their users, while also not messing around with what makes that app special in the first place.

Take away the burden of complexity, and put it on yourself

But obviously it’s crazy to think about shutting out all new opportunities and experimentation. There’s a world of experiences to explore surrounding nostalgia at Timehop, and we give ourselves the freedom to have fun all the time. But how do we find that balance? For starters, we default to simplicity. We KISS: keep it simple, stupid.

“As long as it seems simple to the user, then we’re doing our job,” said Niki Sri-Kumar, Chou’s colleague at Affirm and Senior Product Manager focused on building out the company’s biggest feature rollout yet: Affirm Anywhere, which turns the Affirm app into a mobile phone credit card.

“If we’re not extraordinarily careful we’re going to start putting some of that burden of complexity on the user,” said Sri-Kumar. “We say that you can come to our app to manage your loans or to take out a new loan to pay for anything you want online. That sounds simple but, in reality, it’s complex from a technology perspective because we have to know how each change impacts everything from our handling of interest accrual to regulatory compliance.”

This sentiment was echoed by Chris Erickson, COO & Co-Founder of Apartment List, a popular app to find apartments. Part of keeping an app simple is to still tackle the complex and tedious problems for the user so they don’t have to do it themselves, he said.

“As it relates to keeping things simple while adding functionality, I think there’s two ways to do it: first, make sure your UI is intuitive and easy for people to accomplish the primary task,” Erickson said. “Second, take any other steps away from them that they’d likely have to do to accomplish that task. Do this in the background for them, so they don’t have to.”  

To personalize or not personalize, that is maybe the question or maybe not

Stay focused. Got it. But that doesn’t mean that simplicity is the same for all your users. The nuance of nostalgia is a little different for everyone. We’re humans with different lives, experiences, and technology. Not everyone wants the same Timehop experience, and understanding this world of difference can make a big difference in the way people use your app.

“If you talk to a million different Lyft users, you’ll find a wide variety of potential improvements as transportation is so personal,” said Lyft’s Kapoor. “The challenge in product development is that if you solved every single problem, and applied it to everyone, you would make the experience potentially worse for everyone because the problems are not relevant to all use cases. That’s where the art and skill really comes into play.”

For example, riders in some areas of the country, particularly more rural parts, tend to use the scheduled ride feature more than those in the city, where there are plenty of cars at any given time. “The question is: how do you deal with that? Do you change the interface if we know someone’s in a rural area? If you change it for everyone it distracts from the experience. So those are the kinds of tradeoffs that we have to go through,” he said, adding that the interface isn’t changed for rural users.

There’s a whole life of experiences every person has before ever opening up your app that makes for a different way to tailor to them. On top of that is an effort to understand how society’s collective habits may be changing and what that means for you.

“In general, we focus on where we think the world is heading,” said Chou. “We don’t really focus on showing different things to different folks or different demographics. But, there’s a reason that millennials are more scared of credit card debt than their own mortality, which is obviously born from many years or decades of seeing other generations of folks fall into credit card debt. When you combine that with the mobile expectations that they have, we really try to skate collectively as a product to where it’s all going and build for the larger audience that we expect to have over the decades.”

While doing that, the company goes on a journey with its users, building up a history of experiences. “Our goal hasn’t changed -- connecting renters with the best place for them,” said Apartment List’s Erickson. “What has changed is our ability to go deeper into the experience with them. As we stay with renters for a longer and longer time, we can add functionality that helps them, not just from that search for a place, but to communicate, decide and actually move into that place.”

In this case, the company’s main goal has not strayed, but it has more data, a richer understanding of people’s context, and better expertise to understand how to make that core use-case better.

Less is more

Apps have to walk a fine line between staying true to their mission and evolving to stay alive. But for every Facebook, which has embraced becoming something like 10 businesses in one, there’s a Twitter, a company that has openly admitted it needed to be less confusing to entice more users. We’ll obviously see how that goes.

Clearly, there isn’t a one-size-fits-all rulebook for how apps evolve. Sorry if you came here looking for that. 404. It doesn’t exist. But there are underlying best practices and collective experiences, not the least of which is staying on that unfettered golden path and delivering on your promise before promising more.

“The biggest factor in terms of growing is ensuring that people are excited about hiring you for one thing,” said Affirm’s Chou. “If you do a great job with that one assignment, it’s a lot easier to get them to hire you for a second job.” Erickson concurred: “Until we feel like we’ve really solved helping renters find the perfect place, we are going to be building features that solve tangential renter problems like managing their utilities or scheduling maintenance requests.”

There is a sense of calm and purpose in focusing on one job, one problem, one golden path--not only for the end user, but also for the teams trying to deliver who want clarity of mind. It’s a fascinating challenge for designers and engineers alike, as our instinct sometimes drives us to build more, not to build better. For Timehop, this means learning more about our collective experiences and focusing on the simplest path to reliving those memories together.

There is truth to the late 19th century mantra: Less is more. For app developers, this means having a clear vision of what your golden path is. It means embracing the beauty in how different every person’s experience is before they ever open your app. And it means, every step of the way, asking yourself, “Are we making this easy for them?”


Society's obsession with nostalgia

nostalgia.png

Making an oldie a goodie to create a brand-spanking new adventure

We’re often told that if you want to move forward, don’t dwell on the past. Rather, learn from it and live in the present, since the past can’t be changed. Yet more often than not, people find themselves getting lost in their past experiences: the idealized versions of their memories, both good and bad.

Here at Timehop, we obviously see the power that old memories have and their ability to shape our future. We’ve been reliving our awkward school photos for years, so it’s no surprise to us to see nostalgia back in vogue.

This fixation on our past is big business; just ask Hollywood. More than 60 percent of the top grossing movies released between 2005 and 2014 were adaptations, sequels, spin-offs or remakes. NBC, Fox and Netflix are bringing back shows such as Will & Grace, The X-Files, Prison Break, Heroes, 24, One Day at a Time and Arrested Development. Advertisers also leverage nostalgia marketing: remember Domino’s Pizza’s ad last year with the guy from Stranger Things playing Ferris Bueller?

While we’ve seen a lot of success, bringing fan favorites back isn’t always easy. We’ve seen plenty of cases where something goes wrong and the remake lacks the same heart as an original. We saw the difficulty with the first remake of Planet of the Apes, a big-budget version of The Lone Ranger, and don’t get me started on I Love the 2000’s.

So how exactly do you make an oldie a goodie? It’s a subtle balance of understanding why nostalgic content seems to have exploded; knowing how to re-engage original fans while creating a new generation of them; and overcoming the challenges in knowing when and how to use old-time favorites.

Technological change driving nostalgia

The 70s. What a decade. We saw the birth of the floppy disk, the mobile cassette player, and yes that brick-sized mobile phone. It was a time when more and more technology was brought to the consumer. During that decade, shows like Happy Days and movies like American Graffiti pulled in audiences yearning for the perceived simplicity of the 1950s. In those moments of collective progress when we see a new future opening up, we reminisce, perhaps as a way of making sure we don’t lose some essential part of ourselves even as we evolve.

“I think it’s cyclical,” said Fell Gray, Executive Director of Verbal Identity at global brand consultancy Interbrand, a subsidiary of Omnicom. “I look at it as part of the next stage of technology that we are embarking on and imminent in the next few years with changes to the Internet of Things, and machine learning and voice interaction changing a lot of our behaviors. It seems natural for us to want to connect back to things that feel emotionally resonant and sometimes comfortable terrain.”

In other words, the nostalgia boom may be so prevalent because we’re in another age of fast technological innovation, with advancements in spaces like virtual reality and, most notably, artificial intelligence. The world has seen amazing leaps forward in the last 25 years; the world we live in now would be almost unrecognizable to someone from 1992. Yes, 1992 was 25 years ago.

Now things are ready to take off in an unprecedented way, and people may find some amount of comfort in what’s familiar. This sentiment was echoed by Todd Shallbetter, COO of Atari.

“Life has become so mile-a-minute and rapid-fire with our handheld devices and content overload, I think there’s a certain yearning and wistfulness to return to some of those comfortable, simpler experiences,” said Shallbetter.

If technology is sparking the need to hold on to the past, it’s also facilitating the ability to do so as well, noted Kevin Allocca, Head of Culture and Trends at YouTube.

“We are in an era of increased engagement with nostalgia because of technology and digital media, essentially because of platforms like YouTube, social media platforms and Timehop,” he said. “There’s this broad accessibility to so many different types of content: snippets of movies that studios upload to YouTube or snippets uploaded to Giphy, images posted on image sites, which all allow for those kinds of connections to the past. At the same time, there’s now access to so many moments relevant and personal to us. And we’re enabled to experience nostalgia in our own way with our own moments.”

We’re journaling more of our lives than ever before. Technology has brought us to where we are now in terms of both wanting to, and being able to, relive the past in ways we have never been able to before.

Tapping loyal fans; gaining new ones

If we’re living in a time of yearning for the past, all you have to do is serve up old content with a few twists and fans will be forever fans, right? Well, not so fast. If you’ve ever actually spent time on the internet, you know that hardcore fans have passionate opinions. The memories they’re attached to are an expression of themselves, and any perceived threat creates unease and friction.

“I think part of my identity, and how I see myself, was shaped by the entertainment that influenced me at different points in my life,” said YouTube’s Allocca. “The new Blade Runner just came out and it’s not just that I love the whole aesthetic and incredible art of that film, but also I want to see it because I see myself as someone wanting to be associated with that film. I think that a huge driver of social interaction is expressing ourselves and expressing our identity and I think nostalgia content is part of the story of ourselves and sharing it allows us to share that story with other people.”

In other words, fans have an extreme emotional attachment to the content and an idealized and possibly lofty version of it in their minds. They have years of personal experiences to wrap around something as simple as an old movie. Attracting new fans is hard enough, meeting the expectations of the old ones is far more complex, requiring a deep understanding of its value and some clever tuning to modernize.

For example, Atari bridged generations of Barbie fans by incorporating newer elements [a more modern game] with older content [Barbie characters]. Atari recently did a Barbie integration inside RollerCoaster Tycoon Touch, one of the company’s mobile games. In the Fall Barbie event, players could add attractions to their theme park inspired by old Barbie toys. They were able to reconnect with the fans who grew up with these toys, now parents themselves, and reach a whole new generation.

“It worked quite well as a way to generate multi-generational nostalgia,” said Atari’s Shallbetter. “We know we’re not going after that three-to-five-year old girl but, in fact, we’re going after the mother who’s collected Barbie dolls her whole life and who has a fantastic 1968 Cher version.”

Bridging these generations requires a keen sensibility to cultural changes and market shifts.

“I think the secret sauce is really paying attention to the marketplace more than anything and analyzing your products or services or ideas against that marketplace and deploying as best you can with those influences in mind,” Shallbetter went on to say. “In developing products we will look at market competition or other games that we respect in the business, either for their financial performance or artistic excellence. And we may adapt our development ideation around what’s working well.”

The challenges of making nostalgia work

Beyond just appealing to original fans, there are other challenges in making nostalgia work, particularly in advertising. This has less to do with older content itself and more to do with how it’s being used.

What shouldn’t you do? Using nostalgic content as a shortcut or as a way of covering up for lack of a real message for a brand will likely fail, said Interbrand’s Gray. “Where it’s thoughtfully done is when a brand has already established a clear point of view and a clear emotional connection with their audience, then nostalgic content can be used as a bridge to another generation.”  

And then there is the risk of a legacy brand appearing out of touch by trying to tap into nostalgia where it doesn’t belong. The “How Do You Do, Fellow Kids?” problem, as I call it.

“If you’re a heritage brand, and your legacy is seen as old or tired, then tapping into that may feel tone deaf to what people are telling you. It may be more an opportunity to first start and look at the experiences you’re creating and that may be a way to find new relevance rather than looking back to go forward,” noted Gray. “I think there are some cases where if you’re looking across brands where there’s heritage and there’s already an existing behavior where people are harkening back or reapportioning some of the artifacts, symbols or content of your brand, if that behavior is already starting then I think there are thoughtful ways that you can feed that and encourage that and help that carry you into new conversations, new points of view,” she said.

Creating a ‘brand-spanking new adventure’

Whether it’s a blurry photo from our phone’s camera roll, an old and seemingly mundane tweet, or the cartoon you watched non-stop as a kid, there’s no debate that nostalgia has become a powerful emotional connector. Some say it’s a driving force of our behavior as we seek to attach ourselves to idealized versions of our past and thereby bring them forward. Others, like us at Timehop, believe it’s one of the best ways to learn and grow from our personal and collective experiences. Apparently, we all do this. And we’ve been doing it since the beginning of time.

You might have heard this before: “What has been, will be again. What’s been done, will be done again. There’s nothing new under the sun.” It originated in the book of Ecclesiastes, was re-hashed in a Shakespearean sonnet, and has been echoed ever since. It may seem to paint a bleak picture of monotony, but we think there is actually beauty in this sentiment. In contemporary parlance, we hear: “What’s old is new, again.” Everything gets a second chance with new experience and purpose.

This is not a message to rely solely on the old. It is an appeal to embrace the power that memories have to transport us. It is that nostalgic connection that can take something like an ’80s cassette tape mix and use it to take us on a completely new journey in 2017. Just look at Guardians of the Galaxy, the sequel. As one movie critic observed: Guardians of the Galaxy “was definitely the Marvel movie to beat… until Vol. 2 came out… offering the same blend of winning ingredients, but amping up the story… the result is a brand-spanking new adventure.”


One Year of DynamoDB at Timehop

dynamo.png

2,675,812,470

That’s the number of rows in our largest database table. Or it was, one year ago today. Since then, it’s grown to over 60 billion rows. On average, that’s roughly 160mm inserts per day.

When I think about it, that seems like quite a lot. But because the database we use is Amazon’s DynamoDB, I rarely have to think about it at all. In fact, when Timehop’s growth spiked from 5,000 users/day to more than 55,000 users/day in a span of eight weeks, it was DynamoDB that saved our butts.

What is DynamoDB?

I don’t intend to explain the ins and outs of what DynamoDB is or what it’s intended for; there’s plenty of documentation online for that. But I will mention the most salient point: DynamoDB is consistent, fast, and scales without limit. And when I say “scales without limit,” I literally mean there is no theoretical boundary to be aware of. That isn’t to say that all of DynamoDB’s features behave identically at every scale, but in terms of consistency and speed it absolutely does. More on that below. First, some history.

The Ex I Worry About Seeing in My Timehop is Mongo

Timehop currently uses DynamoDB as its primary data store. Every historical tweet, photo, checkin, and status is persisted to DynamoDB as time-series data. It wasn’t always this way. Back in ye olden days, we used a combination of Postgres and Mongo. So why the switch?

Listen, Mongo is great. I mean, I’m sure it’s pretty good. I mean, there’s probably people out there who are pretty good with it. I mean, I don’t think my soul was prepared for using Mongo at scale.

When we were small, Mongo was a joy to use. We didn’t have to think too hard about the documents we were storing, and adding an index wasn’t very painful. Once we hit the 2TB mark, however, we started to see severe and unpredictable spikes in query latency. Now, this might have been due to our hosting provider and had nothing to do with Mongo itself (we were using a managed service through Heroku at the time). Or, if we had rolled our own, we might have solved things with a sharded setup. But to be honest, none of us at Timehop are DBAs, and a hosted database has always been a more attractive option (as long as it didn’t break the bank!). So we started to look at alternatives. Our friends at GroupMe had some nice things to say about using DynamoDB and it seemed pretty cost-effective too, so we gave it a go.

Early Mistakes with DynamoDB

As is often the case with new technologies, DynamoDB first seemed unwieldy. The biggest hurdle for us right out of the gate was the lack of a mature client library written in Go, our primary language (Things have changed since then). But once we nailed down the subset of API queries we actually needed to understand, things started to move along. That’s when we started making real mistakes.

Mistake #1: Disrespecting Throughput Errors

This one should have been obvious, as the error handling docs are quite clear. DynamoDB has two main types of HTTP errors: retryable and non-retryable. The most common type of retryable error is a throughput exception. The way Dynamo’s pricing model works is that you pay for query throughput, with reads and writes configured independently. And the way Amazon enforces throughput capacity is to throttle any queries that bust it. It’s up to the client to handle the errors and retry requests with an exponential backoff. Our early implementations assumed this would be a rare occurrence, when in reality you will likely incur some amount of throttling at any level of usage. Make sure you handle this frequent type of error.
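To illustrate what “handle and retry with an exponential backoff” means in practice, here is a minimal sketch using today’s aws-sdk-go. It is not Timehop’s production code (we predated a mature Go SDK), and the retry count and backoff values are arbitrary example numbers.

```go
// Sketch: retrying a DynamoDB write with exponential backoff when throttled.
// Retry count and backoff values are illustrative, not recommendations.
package store

import (
	"time"

	"github.com/aws/aws-sdk-go/aws/awserr"
	"github.com/aws/aws-sdk-go/service/dynamodb"
)

func putWithBackoff(db *dynamodb.DynamoDB, input *dynamodb.PutItemInput) error {
	backoff := 50 * time.Millisecond
	const maxAttempts = 8

	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		_, err = db.PutItem(input)
		if err == nil {
			return nil // write accepted
		}
		// Throttling is the common retryable case; back off and try again.
		if aerr, ok := err.(awserr.Error); ok &&
			aerr.Code() == dynamodb.ErrCodeProvisionedThroughputExceededException {
			time.Sleep(backoff)
			backoff *= 2
			continue
		}
		return err // non-retryable error: fail immediately
	}
	return err
}
```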

Mistake #2: Not Understanding How Partitions Affect Throughput

All DynamoDB tables are partitioned under the covers and each partition gets a sliver of your provisioned (purchased) capacity. The equation is simple:

Total Provisioned Throughput / Partitions = Throughput Per Partition

What this means is that if your query patterns are not well distributed across hash keys, you may only achieve a fraction of your provisioned throughput. Not understanding this point can lead to serious headaches.
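As a hypothetical example with made-up numbers: a table provisioned for 1,000 writes per second that DynamoDB has split into 10 partitions gives each partition roughly 100 writes per second. If your traffic clusters on a single hash key, and therefore a single partition, you can be throttled at around 100 writes per second even though you are paying for 1,000.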

It’s also important to note that the number of partitions is not directly exposed to you (however, there is now a helpful guide for estimating the number of partitions you are accruing). A single table may have 1 partition or 1,000. That means it’s difficult to predict exactly how hot a key can become. And the larger your table grows, the more partitions will be allocated to it, which means you may not even notice hot-key issues until you are at scale. The best thing you can do is fully understand how to design and use a well-distributed hash key.

Mistake #3: Poorly Designed Hash Keys

DynamoDB tables can be configured with just a hash key, or with a composite hash and range key. Timehop’s data is time-series, so for us a range key is a necessity. But because hash keys always map to a single virtual node in a partition, a large set of range keys per hash key can lead to hot-key problems. I dug into my old emails and found a few write capacity graphs that illustrate this point:

[Write capacity graph: March 2015]

[Write capacity graph: March 2014]

The red lines here indicate provisioned throughput, ie: the most you could ever use. The blue lines represent actual throughput achieved.

The top graph represents our current schema, which has a well-designed hash key based on a combination of user id and calendar date.

The bottom graph shows our first stab last March, where our hash key was just the user id. Notice the difference? Yeah, it was bad. It took us about 3 attempts at loading and destroying entire tables before we got it right.

Advanced Lessons

Once we stopped making the above mistakes we found further ways to optimize our DynamoDB usage patterns and reduce costs. These points are actually covered in the developer guidelines, but we had to make our own mistakes to reach the same conclusions.

Lesson #1: Partition Your Hash Keys

Every hash key will consistently hit a single virtual node in DynamoDB. That means that every range key associated with a hash key also hits that same node. If a Timehop user has 5000 items of content (a rough average) and most of that content is written at around the same time, that can result in a very hot hash key. Anything you can do to split the data across multiple hash keys will improve this. The pros and cons of doing so are:

  • Pro: Hash key distribution increases linearly with each hash key partition; Throughput errors decrease.
  • Con: The number of network queries required to write the same amount of data increases linearly with each hash key partition; the likelihood of i/o errors and poor network latency increases.

For Timehop our hash key partition strategy was a natural function of our use case: We partition all user data by date. This gives us a significant increase in hash key distribution. As an example, a user with 5000 items of content across 7 years of social media activity sees about a 2.5k times increase in distribution:

Without partitioning (ie: hash key = “<user_id>”):

5000 items per hash key on average

With partitioning (ie: hash key = “<date>:<user_id>”):

5000 / 365 days / 7 years = about 2 items per hash key on average
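In code, the change between the two schemes above is tiny; the win comes entirely from the shape of the key. A sketch in Go (the formats mirror the examples above and are illustrative, not our literal schema):

```go
// Illustrative hash key construction for time-series items. The formats
// mirror the "<user_id>" and "<date>:<user_id>" examples above; they are
// assumptions about shape, not Timehop's literal schema.
package keys

import "fmt"

// Unpartitioned: every item a user owns lands on the same hash key,
// and therefore on the same DynamoDB partition.
func hashKeyByUser(userID int64) string {
	return fmt.Sprintf("%d", userID)
}

// Date-partitioned: a user's items are spread across one hash key per
// calendar date, e.g. "2014-03-21:42", distributing writes across partitions.
func hashKeyByDateAndUser(date string, userID int64) string {
	return fmt.Sprintf("%s:%d", date, userID)
}
```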

Lesson #2: Temporally Distribute Your Write Activity

As I stated earlier, a table that uses a range key can expect at least some throttling to occur. How many errors you experience is directly tied to how well distributed your hash keys are. However, if you really want to maximize your throughput, you should also consider temporally distributing the writes you make against a single hash key.

Our writes happen asynchronously when a user signs up and connects their social media accounts. At any given moment, the write requests that occur are heavily skewed towards the subset of hash keys that represent the most recent new users. To encourage well-distributed writes, we built a simple write buffer that sits in front of DynamoDB (essentially a managed Redis set). A separate process drains random elements from the buffer and issues batch write requests to DynamoDB. The more users that sign up together, the more data ends up in the buffer. The key benefit of this architecture is that the more data there is in the buffer, the more randomization there is when draining it, and therefore the more distributed the hash keys are in our batch write requests. So increasing scale actually improves our write performance. Yay!
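Here is a minimal sketch of that buffer-and-drain shape. The Buffer and BatchWriter interfaces stand in for the managed Redis set and the DynamoDB batch-write call; the real system obviously also handles batch size limits, retries, and serialization.

```go
// Sketch of the write buffer described above: producers add items to a shared
// set, and a separate drainer pulls random members and batch-writes them, so
// the hash keys in any one batch are well distributed. Both interfaces are
// hypothetical stand-ins, not Timehop's actual code.
package buffer

import "time"

type Item struct {
	HashKey string
	Payload []byte
}

type Buffer interface {
	Add(items ...Item) error
	PopRandom(n int) ([]Item, error) // e.g. backed by a Redis set's SPOP
}

type BatchWriter interface {
	BatchWrite(items []Item) error // e.g. a wrapper around DynamoDB BatchWriteItem
}

// Drain loops forever, pulling random batches out of the buffer and writing
// them to DynamoDB. The randomness is what spreads each batch across hash keys.
func Drain(buf Buffer, db BatchWriter, batchSize int) {
	for {
		items, err := buf.PopRandom(batchSize)
		if err != nil || len(items) == 0 {
			time.Sleep(100 * time.Millisecond) // nothing to drain; back off briefly
			continue
		}
		if err := db.BatchWrite(items); err != nil {
			_ = buf.Add(items...) // put the batch back so it isn't lost
		}
	}
}
```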

Scaling Without Limit

So now that we’ve got our usage patterns stabilized, what’s the real benefit? The most explicit win we saw from using DynamoDB was during our most significant period of user growth, which just happened to coincide with our final draft of a DynamoDB strategy.

During the 8-week period between March and May of 2014, our user growth started to spike upwards from roughly 5k signups/day to more than 55k signups/day. As you can imagine, we were hyper-focused on our infrastructure during this time.

There were many, many fires. Each dip in the graph probably corresponds to some outage or backend service falling over due to load. But what the dips definitely do not represent is an outage or performance problem with DynamoDB. In fact, for the last year, no matter how large our table grew or how many rows we’ve written to it (currently on the order of hundreds of millions per day), the query latency has always been single-digit milliseconds, usually 4–6.

I can’t stress enough how nice it was not to have to think about DynamoDB during this time (aside from a few throughput increases here and there). This was a new experience for Timehop Engineering as in the past the database was always the choke point during periods of growth. The stability and performance of DynamoDB allowed us to focus our efforts on improving the rest of our infrastructure and gave us confidence that we can handle the next large spike in user signups.

AWS describes DynamoDB as providing “single-digit millisecond latency at any scale” (emphasis mine). We’ve asked the DynamoDB team what the theoretical limits are, and the answer is that, barring the physical limitations of allocating enough servers, there are none. As evidence of this, our table is quite large now, about 100TB, and the performance is the same as on day one.


Impedance (mis)matching

mismatching.png

In this post we’ll focus on the architecture of our Push Notification subsystem, covering some of the decisions made, talking about our open source Go library for sending pushes through the Apple Push Notification System, and wrapping up with an anecdote on how we ended up DDoS’ing ourselves with what we’d built.

Sunsetting the Rails Monolith

Over the past year at Timehop, we broke our big monolithic Rails app into a service based architecture, written almost entirely in Go.

Breaking a big system down into smaller parts makes it far more modular and, when done right, more available and fault-tolerant.

We ended up with more services but fewer single points of failure. There are now more potential points of failure but none of them can — or should — cause a complete halt.

One of the side effects of dividing to conquer is that communication becomes explicit. Functionality and error handling are now spread across multiple processes, over the network — which also makes them less reliable.

Impedance matching: buffering and throttling

At Timehop, we put a lot of effort into making sure that all communication between our systems is correctly buffered and/or throttled, so as to avoid internally DDoS’ing ourselves.

Whenever a system can offload some work for deferred processing by other systems, we use message queues as buffers. As those queues often grow to hold millions of records in a short amount of time, we keep a close eye on them through an extensive set of alarms.

Whenever a system needs a real-time response from another (e.g. an HTTP call or some other form of RPC), we use aggressive timeouts on the requesting side and throttling on the serving side. It’s all designed to fail fast; the requester won’t wait longer than a few seconds for a response and the server will immediately return an error if too many requests are already being served.

We would rather fail fast and keep the median response times low, even if it comes at a small cost in percentage of successful requests served:

  • From an infrastructure perspective, we’re in control of the internal resource usage. We decide when the system should stop responding vs let it grind itself to a halt.
  • From a UX perspective, we can often silently mask errors. When we cannot, we believe it’s still preferable to quickly show the user something went wrong over keeping her indefinitely waiting for that spinner.
  • From a service perspective, a degraded experience for some is better than no experience for all.

The push notification subsystem

We call it salt-n-pepa. I would have personally gone with static-x.

Whenever we need to send out a push notification to a Timehopper, we load up all her device tokens (one per phone) and then hit Google’s GCM or Apple’s APNS.

If you’ve never dealt with push notification systems, a device token is what Apple and Google use to uniquely identify your device(s) so that we can send you notifications.

With our monolithic system, we kept all these tokens in a PostgreSQL database, which was hidden behind the niceties of Rails’ ActiveRecord. Grabbing the Apple device tokens for a user was as easy as calling a method on a User object — user.valid_apns_tokens.

As the need arose to perform the same tasks from multiple parts of our shiny new (but incredibly lean and minimalist) Go stack, multiple problems became apparent:

  • Duplicate code in different languages: higher effort to maintain two codebases.
  • Tight coupling to the database: some systems required a connection to the database for the sole purpose of loading tokens to send pushes.
  • Harder cache management: if cache invalidation on its own is often very tricky, then distributed cache invalidation is a nightmare.
  • Difficult upgrades: whenever the logic changed, we’d have to upgrade not only the different codebases but all the different systems using that code. The more independent moving parts you have, the harder this procedure is.

To solve those problems, we created a black-box service, salt-n-pepa, that has message queues as entry points. Messages (or tasks) in this queue are JSON documents whose most notable fields are a target user ID, some content, and, optionally, a delivery time (so that it supports scheduling for future delivery as well as immediate sends).
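For illustration, a job might look something like this; the exact field names are hypothetical, since they aren’t spelled out in the post:

{
  "user_id": 12345,
  "content": "You have 12 photos from 3 years ago today",
  "deliver_at": "2015-05-04T08:45:00-04:00"
}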

The moving parts

Internally, the push system has multiple components, each with a single, very well defined responsibility.

  • The Demuxer: The entry point into the push system, it reads push notification jobs (the messages we covered above) off of a queue. It then loads all the valid device tokens for both APNS and GCM and, for each, queues a push to be sent immediately by the appropriate Pusher. If the push is scheduled for future delivery, it instead puts it in a timestamp-based set so the Deschedulers can move it to the appropriate Pusher queue when the time comes. A single push notification job may end up generating multiple pushes if the user has Timehop installed on more than one device.
  • The APNS & GCM Deschedulers: At the right time, transfer pushes scheduled for future delivery to the appropriate Pusher queue (APNS or GCM); see the sketch after this list.
  • The APNS Pusher: Converts the contents of a message into APNS format and sends it down Apple’s Push Notification System. This is a fire-and-forget system, with no feedback on message delivery status. This process uses our open source Go APNS library, which we’ll cover ahead.
  • The GCM Pusher: Converts the contents of a message into GCM format and sends it down Google’s Cloud Messaging platform. This system is synchronous in the sense that for every request that hits GCM, we know whether the push was successfully scheduled or whether the token is invalid. When a token is invalid, the GCM Pusher queues an invalidation for the GCM Invalidator.
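The post doesn’t name the store behind that timestamp-based set, but as a minimal sketch, here is how a descheduler loop could work if it were a Redis sorted set (scores are delivery timestamps) accessed through the redigo client. The key names and the queue layout are assumptions for illustration:

import (
  "time"

  "github.com/gomodule/redigo/redis"
)

// descheduleLoop moves pushes whose delivery time has passed from the
// scheduled sorted set onto the immediate-delivery queue.
func descheduleLoop(conn redis.Conn, scheduledKey, queueKey string) error {
  for {
    now := time.Now().Unix()

    // Grab every payload scheduled for delivery at or before now.
    due, err := redis.Strings(conn.Do("ZRANGEBYSCORE", scheduledKey, "-inf", now))
    if err != nil {
      return err
    }

    for _, payload := range due {
      // Push onto the pusher's queue, then drop it from the scheduled set.
      if _, err := conn.Do("LPUSH", queueKey, payload); err != nil {
        return err
      }
      if _, err := conn.Do("ZREM", scheduledKey, payload); err != nil {
        return err
      }
    }

    time.Sleep(time.Second)
  }
}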

Aside from these, there are also a few other components related to token registration and invalidation.

  • The APNS Invalidator: Periodically connects to APNS to download a list of invalid tokens and update our Apple device token records.
  • The GCM Invalidator: Reads off of the GCM token invalidation queue (populated by the GCM Pusher) and updates our GCM device token records.
  • The Registrar: Reads off of the device token registration queue (populated by other subsystems that want to register new device tokens for users) and updates the device token records for the user.

With this system we send, on average, 25 million push notifications every day.

Timehop’s Go APNS library

One of the hardest parts of this whole system was writing the actual code that talks to APNS to send the pushes.

Whereas with GCM you perform an HTTP request and immediately know the result, Apple took a less common approach in which you have to open a TLS connection and adopt their binary protocol. You write bytes to a socket instead of HTTP POSTs to a web server. To gather feedback on which tokens are now invalid, you have to open a separate connection to receive this information.

As we looked for good libraries, we realized the landscape was grim so we decided to roll our own, which features:

  • Long Lived Clients: Apple’s documentation states that you should hold a persistent connection open as opposed to creating a new one for every payload.
  • Use of v2 Protocol: Apple came out with v2 of their API with support for variable length payloads. This library uses that protocol.
  • Robust Send Guarantees: APNS reports errors asynchronously and closes the connection after a bad send, which means any pushes written after the bad one, but before you notice the error, would be lost forever. Our library records the last N pushes, detects errors, and is able to resend the pushes that could have been lost. You can learn more about this here.

So head over to the GitHub project page and give it a spin!

How we DDoS’ed ourselves with pushes

Every day, the system that prepares your next Timehop day (briefly discussed in this other article) enqueues about 15 million push notifications to be sent shortly before 9am in your local timezone. This scheduling is randomized within a 30-minute window so that, for every timezone, we get an evenly distributed traffic pattern — as opposed to a massive influx of traffic when everyone opens the app at the exact same time.
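The jitter itself is trivial. Something along these lines would do it (the function and its arguments are just a sketch; the 9am anchor and 30-minute window come from the description above):

import (
  "math/rand"
  "time"
)

// deliveryTime picks a random instant inside the half hour leading up to 9am
// in the user's local timezone, so pushes are spread evenly across the window.
func deliveryTime(day time.Time, loc *time.Location) time.Time {
  nineAM := time.Date(day.Year(), day.Month(), day.Day(), 9, 0, 0, 0, loc)
  jitter := time.Duration(rand.Int63n(int64(30 * time.Minute)))
  return nineAM.Add(-jitter)
}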

All this is performed far in advance of the actual push being sent so we end up queueing plenty of messages, which the de-schedulers will then move on to the appropriate queues to be sent immediately when the time comes. It’s normal to have a few million entries scheduled for later delivery.

The actual sending on the APNS side is pretty fast. It takes about 2ms to run a full cycle — pop a notification from the queue and send it to Apple’s Push servers. Rinse and repeat.

We run a single process, on a single machine, with 50 workers (each in its own goroutine). It’s so fast that its queue never backs up, no matter what we throw at it.

It’s one of those things that has been so reliable for so long that you kind of forget about it when there are other fires to put out. So reliable and fast that we forgot to put alarms in place for when its queue starts backing up.

And then it got fun.

What goes around, comes around

We never really put thought into limiting the outbound rate of our pushes — as long as Apple could handle it, we’d hammer them.

What we naively overlooked was the fact that pretty much every push we send causes an indirect hit on our client-facing API, as the users open the app.

The morning push: nobody can resist opening the app after one of these.

The higher the volume of immediate pushes sent, the higher the potential volume of hits on our API.

A week ago, due to a certificate problem with our APNS pusher, each of the 50 workers running on the APNS Pusher slowly started to die. We didn’t really notice anything as, even with just a couple workers left, we were still keeping up with the rate at which pushes were being generated.

Then, the last worker died. No more APNS pushes were sent.

While we did not have an alarm in place, the unusually low morning traffic that our dashboards were showing was not a good sign — that and the fact that we didn’t get our own morning pushes either.

As we investigated and reached the natural conclusion that the APNS Pusher was dead — at that point, the queue had over 6 million pushes and growing — we restarted it.

Within 30 minutes, our client-facing API error rates went up by 60% and our inbound traffic went up nearly 3x. When we looked at the push queue, it was empty. Over 6 million pushes sent in under 40 minutes. Most of those reached people who actually opened Timehop and hit our servers.

An incredibly simple rate limiter

All it took for this to never happen again was a few lines of code. The algorithm is pretty simple:

  • Each worker, running on its own goroutine, has access to a rate limiter
  • Whenever they’re about to begin a cycle, they query the rate limiter
  • If the rate limiter is over the threshold, the worker sleeps for a bit
  • If the rate limiter is under the threshold, the worker performs a work cycle
  • Every minute, another worker resets the rate limiter

Kinda like pushing the button in LOST.

Here’s what it looks like:

import "sync/atomic"

func NewLimiter(limit int64) *Limiter {
  return &Limiter{limit: limit}
}

type Limiter struct {
  limit   int64
  counter int64
}

// Atomically increments the underlying counter
// and returns whether the new value of counter
// is under the limit, i.e. whether the caller should
// proceed or abort.
func (s *Limiter) Increment() bool {
  return atomic.AddInt64(&s.counter, 1) <= s.limit
}

// Atomically resets the value of the counter to 0.
func (s *Limiter) Clear() {
  atomic.StoreInt64(&s.counter, 0)
}

The Limiter is then shared across all the workers (goroutines), and whenever they’re about to begin a new cycle, they simply test whether they can proceed:

func (s *apnsWorker) workCycle() bool {
  if !s.limiter.Increment() {
    return false
  }
  // ...
}

Lastly, another goroutine calls Clear() on this shared Limiter every minute, which allows the workers to begin sending pushes again.
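That reset goroutine is just a ticker loop. A minimal sketch, where sharedLimiter stands in for however the single Limiter instance is wired into the workers:

// Every minute, zero the shared counter so the workers can resume sending.
go func(limiter *Limiter) {
  for range time.Tick(time.Minute) {
    limiter.Clear()
  }
}(sharedLimiter)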

A final note

When going distributed you’ll invariably run into throughput impedance mismatches. Make sure you dedicate some time to understand how every part of your system will affect the next and how you can use different techniques, such as the ones we talked about in this article, to help mitigate the effects.

Oh, and always keep an eye out for how outbound traffic can get back at you so you don’t end up nuking yourself like we did! 😬


Why Timehop Chose Go to Replace Our Rails App


Here at Timehop, we’ve been running Go in production for over a year and a half and have fallen in love with it. We’ve gone from a median response time of 700ms with our Rails app to a 95th percentile response time of 70ms. We do this while serving over six million unique users every single day.

I thought I’d share the highlights of our conversation in hopes that it would be useful to other engineering teams considering Go.

What prompted you to first consider Go?

Like many startups, we originally built Timehop as a Rails app. Then we started growing extremely quickly, and what we’d built in Rails couldn’t keep up.

A lot of what we do lends itself to being parallelized. Whenever a user opens the app, we gather data from all the years they have content. The queries we issue to our database are independent of each other, which makes them very easily parallelized. We tried making it work with Ruby, but ultimately, Ruby doesn’t have true multithreading and the abstractions to work around that limitation felt brittle.
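To make that concrete, here is roughly what that fan-out looks like in Go. The Post type and loadYear function are hypothetical stand-ins for the real per-year queries:

import "sync"

// loadAllYears runs one independent query per year concurrently and
// collects the results. Each query gets its own goroutine.
func loadAllYears(userID int64, years []int) [][]Post {
  results := make([][]Post, len(years))

  var wg sync.WaitGroup
  for i, year := range years {
    wg.Add(1)
    go func(i, year int) {
      defer wg.Done()
      results[i] = loadYear(userID, year) // independent query, safe to run in parallel
    }(i, year)
  }
  wg.Wait()

  return results
}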

When we set out to explore new languages we had three things at the top of our wish list: easy concurrency, true parallelism, and performance.

Why did Go win compared to the other languages you considered?

We looked at a few other languages (Node.js primarily), but ultimately Go sold us on three major things:

  • Performance — Go code is compiled down to machine code. There is no VM or interpreter that adds overhead.
  • Static typing — Turns out computers are way better than humans at catching a whole class of errors when they know what type a variable is. Who knew?
  • Sane, readable concurrency — Goroutines and channels make concurrent code easy to read and reason about. Those constructs also make it possible to write safe concurrent code without explicit locks. Also, no callback spaghetti.

Those were the initial points that really sold us on it. As time went on, we added to the list:

  • Dead-simple deployment — it compiles down to a single binary with all of its dependencies. More on that later.
  • Amazing toolchain — Go also includes tons of amazing tools, not the least of which is the code formatter `go fmt`. It has eliminated code formatting debates, and with them, an untold number of wasted developer-hours.
  • Extremely robust standard library — we’ve found that we haven’t needed a ton of third party libraries. The libraries provided by the language are extremely well thought out.

Go checked off all of the boxes for our requirements — and then some. The additional benefits made it a clear winner in our book.

Were there any surprises — positive or negative — once you started using Go?

We were all hesitant about losing productivity in the switch. Ruby is a very expressive programming language, which allowed us to write a lot of code quickly. We were concerned we’d lose that by switching to a type-safe, compiled language.

We didn’t need to be concerned. Very quickly after the switch we found ourselves writing Go code just as fast, and a lot more safely. Go’s type safety prevented a lot of the fat-fingered mistakes that are all too common in Ruby.

Compiling our code also turned out not to be an issue — our largest Go app compiles in ~2.5 seconds at worst.

How did the team ramp up? What’s your advice to help teams go through this process smoothly?

TL;DR: Tour of Go, Effective Go, and reading the standard library.

What are Go’s weaknesses?

Dependency management.

Go has a convenient import scheme where you include packages by their location, i.e.:

import "github.com/timehop/golog/log"

You can pull down the code with a simple “go get” command in your terminal. While convenient, it can also be a deployment headache because it pulls HEAD from the repo. This can get in the way of shipping a feature because someone else’s code changed locations or had breaking API changes.

The most popular dependency management tool right now is Godep. At a high level, Godep pulls all of your dependencies and then vendors them into your project — essentially copying the dependencies’ code into your project so that you always have a working copy of them.
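In practice that looks something like this (a sketch of typical Godep usage rather than our exact workflow):

$ go get github.com/tools/godep
$ godep save ./...   # vendor the current dependency versions into the repo
$ godep go build     # build against the vendored copies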

It is worth mentioning that this weakness is by design. The creators of Go have specifically avoided building a dependency system because they didn’t feel they knew what a general solution should look like for everyone.

What Go libraries are critical to deployment on the modern web?

When we first started writing Go, we googled around for “Rails for Go.” We quickly realized that was overkill for building out a JSON API.

All of our web services simply use the standard net/http library and gorilla/mux for routing. Others seem to do the same.
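A minimal service in that style looks roughly like the following; the route and handler are illustrative, not taken from our codebase:

import (
  "encoding/json"
  "net/http"

  "github.com/gorilla/mux"
)

func newRouter() *mux.Router {
  r := mux.NewRouter()

  // A single JSON endpoint; gorilla/mux extracts the {id} path variable.
  r.HandleFunc("/users/{id}", func(w http.ResponseWriter, req *http.Request) {
    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(map[string]string{"id": mux.Vars(req)["id"]})
  }).Methods("GET")

  return r
}

// Serving it is just the standard library:
// http.ListenAndServe(":8080", newRouter())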

What are options for hosting? How does deployment work?

We started on Heroku because our Rails app was hosted there as well. It was simple to deploy using the Go buildpack.

We eventually migrated to EC2 and deployment was just as easy. Go apps compile down to a single binary, so it can be as simple as scp-ing the binary to a box and running it.

Our current process is:

  1. Push code to GitHub
  2. Run tests via Travis
  3. On success, build the binaries, tar them, and upload to S3

The app servers simply have to pull down the tar, unpack it, and run it (we do this via Chef).

We haven’t needed to use this, but it also makes compiling binaries for different architectures as easy as:

$ GOOS=linux GOARCH=arm go build main.go

That means you could easily build your app for many types of operating systems and embedded devices.

Is the language suited for building APIs?

Yes. The Go encoding libraries make writing APIs stupidly simple. Take, for example, a User struct (which is similar to a class, but distinctly not):

type User struct {
  FirstName string `json:"first_name"`
  LastName  string `json:"last_name"`
  Password  string `json:"-"`
}

Those tags after those fields define how they’re going to be serialized out to JSON. No need for custom serialization functions.

To output an instance of a User, it is simply:

u := User{FirstName: "Abe", LastName: "Dino", Password: "p4ssw0rd"}
jsonBytes, _ := json.Marshal(u)
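For the struct above, that single call yields the following (note that Password is dropped entirely thanks to its `json:"-"` tag):

{"first_name":"Abe","last_name":"Dino"}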

How does the language deal with polymorphism and modularization?

Go isn’t an object-oriented language — it doesn’t have any type hierarchy. While it initially took some getting used to, it quickly became a positive. Because of the lack of inheritance, it naturally encourages composition.

It also has a really nice interface system. Unlike other languages, where a class has to explicitly declare that it implements an interface, in Go a type satisfies an interface simply by having the right methods. It’s all very philosophical. We’ve found it to be one of the most powerful features of the language. You can read more about why here.
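Here is a tiny illustration; the Notifier interface and APNSClient type are made up for this example:

// Any type with a matching Send method satisfies Notifier. There is no
// "implements" keyword anywhere.
type Notifier interface {
  Send(token, message string) error
}

type APNSClient struct{}

func (c APNSClient) Send(token, message string) error {
  // ... talk to Apple ...
  return nil
}

// APNSClient can now be passed anywhere a Notifier is expected,
// without ever mentioning the interface in its declaration.
func notifyAll(n Notifier, tokens []string, message string) error {
  for _, t := range tokens {
    if err := n.Send(t, message); err != nil {
      return err
    }
  }
  return nil
}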

How important is Google’s involvement in the project?

It’s huge — having Google release and continue to invest in the language is great for everyone. It’s clear they’re using it internally and that can only mean continued improvement to the language, tools, and ecosystem.

In addition to the backing of Google, the community has grown by an insane amount since we started. It’s been amazing to see the community come alive, which has been a boon to the amount of open source code and content now available in Go.

Hopefully, this was helpful for teams trying to decide whether Go is worth a look. There are lots of questions whenever you consider a new language, and hopefully we’ve answered some of those here. We love writing Go and have had a ton of success doing so.


Why shutting down Timehop’s daily email was the best decision we ever made...


Timehop wasn't always the Timehop it is today. Foursquare & 7 Years Ago was the name of the daily email service that Timehop started out as in 2011, created at Foursquare’s first-ever hackathon. Users received an email each morning telling them where they checked in on Foursquare on this day…365 days ago. Fun. But they wondered, would people actually want to read this every day?

A nice surprise to them was that, yeah, people did. In fact, around 60% of users read it daily. So they created a version for Facebook posts, as well as a version for Instagram. Those all eventually merged into one daily email called…Timehop! All was well in time travel land. People loved their email.

So why nix a product 90,000 people loved? We're about to dig into that, but it wound up being one of the best decisions we ever made.

Another thing that might be a little hard to believe is that it was actually a pretty easy decision. We had a very highly engaged email user base, but they were not helping with our growth. Users enjoyed opening the email, but we could not get them to do anything with it.

Timehop.com

Here’s the rub: there isn’t much users can do with an email. To try to overcome this, we created a social experience around timehop.com where people could interact around the past, see their friends’ feeds, and use other “social” features that gave them tools to be more interactive. It was exactly what we needed because it let users share their posts with friends, which would lead to us gaining more users. Problem solved?

Nope.

The new challenge for us was getting someone from their inbox to Timehop’s website. It’s a lot harder than it sounds. We tried some nice ways and some sneaky ways to get users to Timehop. We once tried only sending users a portion of their Timehop in the email and giving them a button in the message that took them to our website where they could see the rest of their day in history. I realized this was a little mean, so we stopped teasing our users.

The team and I went back to the numbers and saw that over half of our users were opening the email on a mobile device. At first I did not think much of that stat because the Timehop email looked great on a phone. Why would we make an app when the Timehop email already offered a working mobile experience? When the growth we needed was still not happening, I realized we had to do something else.

iPhone App

This is when the idea of an iPhone app made perfect sense. An app would allow for more depth, more interaction from our users, and it would hopefully have the same level of engagement as our daily email. We took the time to build the app, and even on day 1 in the App Store we had more new downloads than new users signing up for the email. iPhone users were also coming back each day to see their Timehop, just like those signed up for the email. Problem solved?

No : (

Now our small team was running a daily email service and an iPhone application (no time for fun). It got to a point where I would get to the office in the morning and my engineers would tell me our emails hadn’t gone out that day, while we also had to keep working on important things for the app. It was pretty obvious that investing our time in the email would not help our growth: for a year we had been stuck at about 200 signups a day. With the app, people can easily share their content, which helps drive growth.

Shutting It Down

We ditched the email (and over 90,000 users) about 9 months after the release of the iPhone app. Here is the email we sent our users:

Dear Timehopper,
You’re receiving this email because you’re subscribed to Timehop’s daily email service that tells you what you did 1 year ago today on Facebook, Twitter, Foursquare, etc.

We wanted to let you know about an important change. We’re sunsetting the Timehop daily email and pulling all our efforts behind the Timehop mobile app. We appreciate your support and hope you’ll understand that as a small startup we have to pick our battles carefully.

We’ll stop sending the daily Timehop emails in 5 days: Wednesday July 17th.

If you have an iPhone/iPod/iPad? Get our app: timehop.com/iphone
If you have an Android or another phone, we don’t currently have an app for you but hopefully we’ll get there in the future. If you’d like to help us with this, we’re hiring — get in touch!

Thanks for your continued support — and see you on mobile!
Team Timehop

Problem solved?

NOPE!!!!!!

Below is a screenshot of my inbox, which was full of hate mail from users responding to the shutdown of the email service.

[Screenshot: my inbox, full of replies about the email shutdown]

This all was a little heartbreaking because some of these users had been with us since 4square&7yearsago, and if they did not have an iPhone they could not continue receiving their daily Timehop. We could not make everyone happy with this decision, but in order to save our time-traveling ship from sinking, the email had to go.

Was it Really the Best Decision?

You bet. Cutting the email really allowed us to focus on creating a great mobile product (who doesn’t love Abe?). It meant the team’s focus was not split and the messaging around our product was the same. We used to send two sets of emails out to our user base: one with information for our email users and one with information about the app.

The decision also improved the management of our infrastructure, which had always been a bit fractured. The infrastructure needed to send emails is very different from the infrastructure needed to support a mobile app like Timehop.

Now, more people download Timehop every two days than ever signed up for the daily email in total.

Best. Decision. Ever.