“Magic links” can end up in Bing search results — rendering them useless.
I recently started verifying user email addresses during signup to block bots, catch mistyped emails, and ensure that addresses are active for future emails and newsletters.
Being the lazy jack-of-all-trades developer that I am, I quickly threw together a rudimentary token system with the following flow:
- The user registers with their email address (this also requires Google reCAPTCHA)
- A unique token is generated and emailed to the user as a link; clicking it verifies the email address and automatically logs them in
- The token expires (a step I forgot to actually implement)
This is basic email verification, but it’s also quite popular now for “Magic Links”: a way to log in to your account without having to actually enter your password (the assumption is that if you have access to the email account, you’re probably who you say you are).
All was well for weeks, then suddenly I noticed an increase in logins, but little to no user action after the login event.
With some digging, I noticed all of these phantom logins were coming from Bingbot. At first, I assumed this was malicious behaviour with a spoofed user agent, but the IPs matched Bing’s, so it was legitimate.
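For anyone wanting to run the same check, here is a minimal sketch of the usual way to confirm that a visitor claiming to be Bingbot really is one: do a reverse DNS lookup on the IP, check that the hostname falls under `search.msn.com`, then resolve that hostname forward and make sure it maps back to the same IP. The function name and the injectable lookup parameters are my own illustration (they make the logic testable without network access), not the exact check used at the time.

```python
import socket


def is_verified_bingbot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if `ip` reverse-resolves to a *.search.msn.com host
    and that host resolves forward to the same IP.

    `reverse_lookup` and `forward_lookup` are hypothetical hooks so the
    logic can be exercised without real DNS; by default they use the
    standard library resolver.
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        hostname = reverse_lookup(ip).lower().rstrip(".")
        # A spoofed user agent will not control the PTR record for the IP.
        if not hostname.endswith(".search.msn.com"):
            return False
        # Forward-confirm: the hostname must map back to the original IP.
        return ip in forward_lookup(hostname)
    except OSError:
        # Covers socket.herror / socket.gaierror (lookup failures).
        return False
```

A user-agent string alone proves nothing, which is why the check goes through DNS in both directions.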
But how was Bing logging into user accounts?
A little more server logging and digging pointed to the email tokens. All of these Bingbot sessions started at the “verify email” URL, with the unique token appended. There was no referrer.
The only logical explanation was that Microsoft was sharing email data (links included) with Bing for indexing. We all know that email service providers harvest our data, but surely they don’t index private email content…right?
A quick Google later (because nobody Bings) and I found this:
As of Feb 2017 Outlook (https://outlook.live.com/) scans emails arriving in your inbox and it sends all found URLs to Bing, to be indexed by Bing crawler.
This effectively makes all one-time use links like login/pass-reset/etc useless.
This felt like I had stumbled onto a WikiLeaks-level conspiracy. Microsoft is sharing private email data with its search engine?
I had to check to be sure…and my fears were confirmed.
Bing has been indexing my email verification links.
Bingbot was then automatically visiting these links, and automatically logging into the new user accounts.
Fortunately, the bot didn’t do anything after the login event except click around a bit, and these were all brand new accounts, so they didn’t hold any sensitive data yet.
As a quick fix, I’ve deployed the missing token expiry feature: all tokens now expire after use and are only valid for 1 hour.
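A rough sketch of what single-use, one-hour tokens can look like is below. It is not the code running on my site: the `TokenStore` class, the in-memory dict, and the choice to store only a SHA-256 digest of each token are assumptions made for illustration, and a real deployment would keep this in a database.

```python
import hashlib
import secrets
import time

TOKEN_TTL_SECONDS = 60 * 60  # tokens are only valid for 1 hour


class TokenStore:
    """Hypothetical in-memory store for email verification tokens."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so expiry can be tested
        self._tokens = {}    # sha256(token) -> (user_id, expires_at)

    @staticmethod
    def _digest(token):
        # Store a digest so a leaked table does not expose usable tokens.
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def issue(self, user_id):
        """Create an unguessable token to embed in the emailed link."""
        token = secrets.token_urlsafe(32)
        expires_at = self._clock() + TOKEN_TTL_SECONDS
        self._tokens[self._digest(token)] = (user_id, expires_at)
        return token

    def redeem(self, token):
        """Return the user_id once, or None if unknown, used, or expired."""
        # pop() removes the token on first use, valid or not.
        entry = self._tokens.pop(self._digest(token), None)
        if entry is None:
            return None
        user_id, expires_at = entry
        if self._clock() > expires_at:
            return None
        return user_id
```

Note that expiry alone does not stop a crawler that fetches the link before the user clicks it: the bot simply burns the token first. It only guarantees that a link sitting in a search index is dead by the time anyone else finds it.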
But I will probably move to a “here is your one-time code” format instead (no links) that the user must manually copy & paste into the webpage for extra peace of mind.
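Here is a minimal sketch of that one-time code idea, under my own assumptions: a six-digit code, a ten-minute lifetime, and a cap of five attempts so the short code cannot be brute-forced. The `CodeVerifier` class and its constants are hypothetical names for illustration only.

```python
import hmac
import secrets
import time

CODE_TTL_SECONDS = 10 * 60  # assumed lifetime; pick what suits your flow
MAX_ATTEMPTS = 5            # assumed cap to limit guessing


class CodeVerifier:
    """Hypothetical in-memory verifier for emailed one-time codes."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._pending = {}  # email -> [code, expires_at, attempts_left]

    def issue(self, email):
        """Generate a 6-digit code to email; replaces any earlier code."""
        code = f"{secrets.randbelow(10**6):06d}"
        expires_at = self._clock() + CODE_TTL_SECONDS
        self._pending[email] = [code, expires_at, MAX_ATTEMPTS]
        return code

    def verify(self, email, submitted):
        """Return True once for the right code; False otherwise."""
        entry = self._pending.get(email)
        if entry is None:
            return False
        code, expires_at, attempts_left = entry
        if self._clock() > expires_at or attempts_left <= 0:
            del self._pending[email]
            return False
        entry[2] -= 1  # every guess costs an attempt
        # Constant-time comparison; bytes avoid errors on non-ASCII input.
        if hmac.compare_digest(code.encode(), submitted.strip().encode()):
            del self._pending[email]  # single use
            return True
        return False
```

Because the code is typed into a form rather than carried in a URL, a crawler that scans the email has nothing to follow.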
No doubt there are numerous other (better) ways to add security to this flow, and detecting bots is fairly easy these days, so I could add extra checks for that. But now that I know any emailed link is at risk of appearing in Bing search results if not set up correctly, I don’t feel comfortable using “Magic Links” at all.