A few notes about login systems; some day I might turn this page into a better-written document. The context is web applications, but the same principles might apply to other domains.
A login system is the piece of code which deals with user authentication. The base case is logging a user into a website when they supply their username and password. There are however many other cases that need to be handled: resetting a password, detecting account hijacking, remediation, etc.
It's interesting to note that a well-designed login system is not going to mitigate other security issues (e.g. a SQL injection can still lead to a full compromise). However, security issues in the login layer can fog the audit trail; everything will look as if the legitimate account holder took certain actions when that might not be the case.
I encourage you to send me questions or suggestions, I will keep updating this page as required.
A few assumptions...
- I'm going to assume the login system works with three pieces of information: an account id, a password, and contact information. The contact information is usually an email address or a phone number, but it could be something else too.
- I'm going to assume that we have some crypto functions (e.g. Encr(key, data) and Decr(key, data)). Let's ignore the key distribution, rotation, revocation issues for now. The ideas here can be used with symmetric or asymmetric crypto.
Threats to keep in mind:
- Weak passwords.
- Password sharing across many sites.
- Malicious employee forging a session.
- Malicious employee adding a backdoor (e.g. adding a security question).
- Password database leak.
- Compromised email address → password reset flow.
- Phishing sites.
- Malware (in the browser → malicious plugin, OS → virus, physical → keylogger).
Sign up flow
Before a user can log in, they need to create an account.
- Many tradeoffs are possible:
- ask for the contact information (email or phone number) + password, require the user to confirm ownership of the contact information (via email link or sms).
- ask for the contact information (email or phone number) + password, let them confirm ownership of the contact information at a later point in time (e.g. within 7 days)
- ask for the contact information, let them create the password at a later point (using a password reset flow).
- Generate a new unique account id. It could be sequential, or fully random (e.g. increment a sequential number, hash it, and start over if the result is not unique). I recommend ids that are randomly distributed over the account id space.
- At sign up time, the system should only ever store salted + hashed passwords. Password hash functions are typically: scrypt, bcrypt or pbkdf2.
- Make sure that the browser's password manager saves the signup password and re-uses it for the login flow.
- For phone number + password:
- Check ability to send text messages to all phone numbers.
- Check ability to normalize phone numbers, i.e. +1 650 123 4567 is the same as 1 650 123 4567, but 650 123 4567 is context dependent!
- Normally, you want the sign up flow to end with a session. I.e. don't require the user to log in right after they create an account.
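The "salted + hashed" rule can be sketched with Python's standard library. This is a minimal sketch, not a definitive implementation: it uses pbkdf2 (one of the functions named above), and the iteration count, salt length and the function names `hash_password` / `verify_password` are assumptions made for illustration.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

PBKDF2_ITERATIONS = 600_000  # assumed work factor; tune it for your hardware
SALT_BYTES = 16              # assumed salt length


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest) to store; a fresh random salt is drawn if none is given."""
    if salt is None:
        salt = os.urandom(SALT_BYTES)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

At sign up time only the `(salt, digest)` pair would be stored next to the account id; the plaintext password is discarded once the request has been handled.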
The base case is pretty simple.
- Given the contact information → lookup account id → lookup salt → compute hash → compare hashes.
- Do you "leak" at login time if the contact information exists or not? (you will most likely leak that in the registration flow?)
- Brute force protection:
- per account id counter.
- use a "same ip" or "same machine cookie" check to reduce legitimate user lockout (i.e. layered counters).
- have different thresholds and use Captcha to reduce automated lock out.
- Automatically fix typos:
- case inversion in password.
- first letter case inversion in password.
- domain level correction, e.g. @gmail.con → @gmail.com (because there is no MX record for gmail.con).
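The two password typo corrections can be sketched as a small candidate generator. This is only an illustration: `password_candidates`, `login_matches` and the `check` callback (standing in for "hash with the account's salt and compare") are hypothetical names, not part of any library.

```python
from typing import Callable, List


def password_candidates(typed: str) -> List[str]:
    """Return the typed password plus its two typo-corrected variants."""
    candidates = [typed]
    # Caps Lock left on: every letter has its case inverted.
    candidates.append(typed.swapcase())
    # Auto-capitalisation (e.g. on phones): only the first character is inverted.
    if typed:
        candidates.append(typed[0].swapcase() + typed[1:])
    # Drop duplicates (e.g. all-digit passwords) while keeping the order.
    return list(dict.fromkeys(candidates))


def login_matches(typed: str, check: Callable[[str], bool]) -> bool:
    """Accept the login if any candidate matches the stored hash."""
    return any(check(candidate) for candidate in password_candidates(typed))
```

Each candidate costs one extra hash computation, and the whole set should probably count as a single attempt for the brute force counters.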
If the login fails, there are ways to help users:
- Show some public information from the account (e.g. profile picture) to confirm they are trying to log into the right account.
- Replace password input field with a text field, so they can see what they are typing.
- Provide link to the password reset flow.
Options for the session token:
- you can make the session == account_id + Encr(site_key, account id), but then you can't revoke a session.
- you can make the session == account_id + Encr(account_key, account id), and revoke all the sessions at once (by changing the account_key).
- you can make the session == account_id + database backed token. You can revoke individual sessions. Limit max number of simultaneous sessions?
- session vs persistent cookies: if you want the user to remain logged in when they close their browser and return, you need to set an expiry date on the cookie.
- cookie refreshing: the browser sends back the value of a persistent cookie, but not its expiry date. If you make the session == timestamp + account_id + ..., you can refresh the session before it expires.
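A minimal sketch of the "timestamp + account_id + ..." session with a per-account key. Note the substitution: instead of Encr(account_key, account id) it appends an HMAC-SHA256 tag computed with the account key. Revocation still works the same way, since changing the account key invalidates every session for that account. The lifetimes, the `keys` dictionary and all function names are assumptions, and account ids are assumed not to contain ":".

```python
import hashlib
import hmac
import time
from typing import Dict, Optional

SESSION_LIFETIME = 14 * 24 * 3600  # assumed: sessions last 14 days
REFRESH_AFTER = 24 * 3600          # assumed: re-issue tokens older than a day


def _tag(account_key: bytes, payload: str) -> str:
    return hmac.new(account_key, payload.encode(), hashlib.sha256).hexdigest()


def make_session(account_id: str, account_key: bytes, now: Optional[int] = None) -> str:
    """Build 'timestamp:account_id:tag'."""
    issued = int(time.time()) if now is None else now
    payload = f"{issued}:{account_id}"
    return f"{payload}:{_tag(account_key, payload)}"


def check_session(token: str, keys: Dict[str, bytes], now: Optional[int] = None) -> Optional[str]:
    """Return the account id if the token is authentic and unexpired, else None."""
    now = int(time.time()) if now is None else now
    try:
        issued_s, account_id, tag = token.split(":")
        issued = int(issued_s)
    except ValueError:
        return None
    key = keys.get(account_id)
    if key is None:
        return None
    expected = _tag(key, f"{issued_s}:{account_id}")
    if not hmac.compare_digest(tag.encode(), expected.encode()):
        return None
    if not 0 <= now - issued <= SESSION_LIFETIME:
        return None
    return account_id


def needs_refresh(token: str, now: Optional[int] = None) -> bool:
    """True once a (previously verified) token is old enough to be re-issued."""
    now = int(time.time()) if now is None else now
    issued = int(token.split(":", 1)[0])
    return now - issued > REFRESH_AFTER
```

When `needs_refresh` returns true for a valid token, the server would set a new cookie built with `make_session` and a fresh expiry date.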
At some point, Internet Explorer had a bug. User logs in → browses site → logs out → hits back button a few times → browser re-submits login credentials. This bug could lead to account compromise in shared computer settings. Workarounds:
- Does using an intermediate page and redirecting to a different URL work?
- Use a token in the login form, detect form re-submit.
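The form token idea can be sketched as a one-time token that is embedded in the login form as a hidden field and consumed on the first submit. The class name `LoginFormTokens` and the in-memory set are assumptions; a real deployment would need shared storage and an expiry.

```python
import secrets


class LoginFormTokens:
    """Tracks tokens handed out with login forms; each one is valid for one submit."""

    def __init__(self) -> None:
        self._outstanding = set()  # tokens issued but not yet submitted

    def issue(self) -> str:
        """Call when rendering the login form; embed the result as a hidden field."""
        token = secrets.token_urlsafe(32)
        self._outstanding.add(token)
        return token

    def consume(self, token: str) -> bool:
        """Call on form submit; False means a re-submit or an unknown token."""
        if token in self._outstanding:
            self._outstanding.remove(token)
            return True
        return False
```

A re-submitted form carries a token that was already consumed, so the server can reject it before looking at the credentials.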
Session termination flow
- If you have persistent cookies, you need a termination flow.
- Make sure session is purged from database + cookie is deleted in browser.
- Do you terminate all sessions or only current session? Some services implement "single sign on", but not "single sign off" :/
- Make sure back button doesn't give you back a session (cache invalidation). Make sure back button to login page doesn't submit the login form (does the IE bug + double redirect hack still apply?).
- Password change should trigger session termination. Alternatively, there should be a way to view active sessions and terminate "other" sessions.
Password reset flow
- Given contact information → send a reset code / reset link. You want the code to be one time use and limited in time: account_id + database backed token.
- The user gets to set a new password by visiting the reset link (or manual reset code entry).
- Normally, you want the reset flow to end with a session. I.e. don't require the user to log in right after they reset their password.
- Should changing the email address on the account invalidate password reset links sent to other addresses?
- Should changing the password on the account invalidate the password reset links?
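A one-time, time-limited, database backed reset token could look roughly like this. It is a sketch under several assumptions: a dictionary stands in for the database table, the one hour lifetime is arbitrary, and storing only a hash of the token is a design choice made here, not something the notes above prescribe. `revoke_account` shows one possible answer to the two questions above, namely invalidating outstanding links when the password or email changes.

```python
import hashlib
import secrets
import time
from typing import Dict, Optional, Tuple

RESET_LIFETIME = 3600  # assumed: reset links are valid for one hour


class ResetTokens:
    def __init__(self) -> None:
        # sha256(token) -> (account_id, expiry timestamp); stands in for a table
        self._rows: Dict[str, Tuple[str, int]] = {}

    @staticmethod
    def _digest(token: str) -> str:
        return hashlib.sha256(token.encode()).hexdigest()

    def issue(self, account_id: str, now: Optional[int] = None) -> str:
        """Create a token to embed in the reset link sent to the user."""
        now = int(time.time()) if now is None else now
        token = secrets.token_urlsafe(32)
        self._rows[self._digest(token)] = (account_id, now + RESET_LIFETIME)
        return token

    def redeem(self, token: str, now: Optional[int] = None) -> Optional[str]:
        """Return the account id for a valid token; the token is removed either way."""
        now = int(time.time()) if now is None else now
        row = self._rows.pop(self._digest(token), None)
        if row is None:
            return None
        account_id, expires = row
        return account_id if now <= expires else None

    def revoke_account(self, account_id: str) -> None:
        """Invalidate all outstanding tokens for one account."""
        self._rows = {h: r for h, r in self._rows.items() if r[0] != account_id}
```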
Handling account information change
People sometimes forget to log off from public computers (school, library, internet cafe, etc.).
- Adding or removing contact information, as well as changing the password, should require re-authentication to prevent account hijacking.
- Such changes should trigger an email/sms notification.
- Such changes should be reversible (e.g. link or flow to revert the change).
- The re-auth flow should have the same rate limiting check as a normal login flow.
There are cases where the password reset flow falls short, e.g. if a user uses the same password for their gmail and your service, both accounts will get compromised at the same time!
- Security question. If you do fuzzy matching, you need to store the security answer in plaintext :(
- Require multiple codes, i.e. two factor authentication + email reset link.
- Have pre-vetted friends (aka guardians) receive a piece of the reset code. (see ssss).
- Micro transactions of random amounts, e.g. $0.02 & $0.42 on user's credit card / bank account.
A persistent cookie that is set only once for a given browser.
- Login notification.
- Can be used to decide when to trigger two factor authentication.
- cookie refreshing: the browser sends back the machine cookie, but not its expiry date. If you make the machine cookie == timestamp + id, you can refresh the cookie before it expires.
- Knowing that multiple people share a computer → don't give persistent cookie (or remember higher risk of compromise).
- Potential privacy issue: ability to tell that two accounts share a computer. In the backend you can store Hash(machine cookie, account id) to make it a little harder to find which accounts share a computer.
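The "timestamp + id" cookie and the Hash(machine cookie, account id) value can be sketched as follows. The cookie layout, the function names and the use of HMAC-SHA256 (keyed with the machine id) as the hash are all assumptions made for this example.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional


def new_machine_cookie(now: Optional[int] = None) -> str:
    """'timestamp.id': the timestamp tells the server when a refresh is due."""
    issued = int(time.time()) if now is None else now
    return f"{issued}.{secrets.token_hex(16)}"


def machine_id(cookie: str) -> str:
    """The random id part, which never changes for a given browser."""
    return cookie.split(".", 1)[1]


def refresh_machine_cookie(cookie: str, now: Optional[int] = None) -> str:
    """Keep the id, bump the timestamp; re-send with a new expiry date."""
    issued = int(time.time()) if now is None else now
    return f"{issued}.{machine_id(cookie)}"


def backend_key(cookie: str, account_id: str) -> str:
    """Value stored server side instead of the raw (machine id, account id) pair."""
    return hmac.new(
        machine_id(cookie).encode(), account_id.encode(), hashlib.sha256
    ).hexdigest()
```

Because the stored value mixes in the account id, two accounts on the same machine produce unrelated values, and linking them requires knowing the machine id itself.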
Two factor authentication
Two factor authentication (2FA) adds an extra layer of security. There are many ways to implement 2FA, and you can require it on every login, or only when something is suspicious.
- SMS code. Not reliable if SMS ends up in email inbox!
- Push notification
- Touch token
- Phone call
- Soft tokens (Google Authenticator)
- Hard tokens (RSA, Vasco, Feitian, etc.)
- Printed one time codes
- Printed grids
- Client side certificates.
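Soft tokens such as Google Authenticator typically implement TOTP (RFC 6238), which is HOTP (RFC 4226) with the counter derived from the current time. A minimal sketch of the server side check, assuming SHA-1, 6 digits and a 30 second step; `verify_totp` and its `window` parameter are illustrative names.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226: HMAC-SHA1 over the 8-byte counter, then dynamic truncation."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, now: Optional[float] = None, step: int = 30) -> str:
    """RFC 6238: HOTP with counter = number of time steps since the epoch."""
    now = time.time() if now is None else now
    return hotp(secret, int(now // step))


def verify_totp(secret: bytes, submitted: str, now: Optional[float] = None,
                step: int = 30, window: int = 1) -> bool:
    """Accept codes from the current step and `window` steps either side (clock drift)."""
    now = time.time() if now is None else now
    counter = int(now // step)
    for delta in range(-window, window + 1):
        if counter + delta < 0:
            continue
        expected = hotp(secret, counter + delta)
        if hmac.compare_digest(expected.encode(), submitted.encode()):
            return True
    return False
```

A production check would also remember the last accepted counter per account so that a code cannot be replayed within its window, and would apply the same rate limiting as the password check.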
Detecting (potential) compromise
Some of these provide weak signals, so consider using them as inputs to a machine learning system.
- Machine cookie.
- IP address → geo position. Fails for people who travel a lot (pilots, drivers, etc.)
- IP address → ASN.
- TCP stack fingerprinting.
- Browser fingerprinting.
- Public password dumps → offline or login flow → detect matches. You can do this even if the passwords are hashed in different ways, since the plaintext exists in the login flow.
- Spam or other activity on the account.
- User reports.
- Login attempt with previous password → password might have been compromised and then changed.
- Time of day / day of week usage analysis.
- When a session ends, mark it as ended in the backend. If activity continues, the session might have been compromised by malware or a man-in-the-middle.
Compromise handling flow
- Use a Captcha to avoid bots.
- Provide the user with a way to reset their password.
- Provide the user with a way to review / terminate sessions.
- Provide the user with a way to revert any account changes.
- Tell the user what happened (i.e. what triggered the compromise handling flow). This can be hard if the decision comes from a "magic blackbox".
- Handle compromised machines (malware removal).
Prevent employee-forged sessions
- Easier if employees don't have access to production databases. Doable but harder otherwise.
- Need to make sure that employees can't overwrite a password hash.
- Need to make sure that employees can't add an email address, phone number, security question, etc.
- Need to make sure that employees can't forge or discover a password reset code. Hard to do in practice: how do you let engineers debug an email or SMS delivery issue while preventing them from reading the data?
Using third party authentication services (OAuth, OpenID, Facebook Connect, etc.) is not only about getting social context (user's name, age, friend list, etc.); it also avoids having to build and maintain the login system itself.
I recommend studying past security bugs in this area.
In some situations (e.g. supporting legacy protocols like SMTP/IMAP/XMPP) you need to provide the ability to set secondary weak passwords. You then need to isolate what can be done with such passwords.
- OWASP Authentication Cheat Sheet
- Engineering Security by Peter Gutmann, see page 498
- OAuth security
- Password storage
- Two factor authentication
- Past security issues
- Facebook Blog