Anonymous User Id Generation#
Status#
Accepted
Context#
The student app provides a mechanism to generate multiple anonymous ids for a
student. The anonymous ID can be independent of all courses or it can be
course specific. To generate the anonymous ID, we currently hash the user’s
id with the Django SECRET_KEY and a course key if provided. The
mapping between the anonymous ID and user id are saved in the
AnonymousUserID table.
As it stands, if the SECRET_KEY is rotated students would get new anonymous
IDs starting immediately after rotation. This can cause downstream issues
where the IDs are output from the system. For example, the IDs are in tracking
data and could be used to track a user’s activity through a course for research
purposes.
Decisions#
Once an anonymous ID is generated for a user in a particular LearningContext
(either a course or some other unit of learning), it will remain that way even
if the secret used to generate the ID changes. For any context where an
anonymous ID does not already exist, a new ID will be generated using the
latest SECRET_KEY.
Consequences#
By keeping old IDs static, we increase the risk that if the salting
data(SECRET_KEY) is leaked, then it can be used to determine and correlate
all anonymous IDs associated with a particular user across all courses. We
believe that this is a worth while risk to not break downstream services that
are using anonymous IDs during the lifetime of a course.
Rejected Alternatives#
Make Anonymous IDs Randomly Generated#
The function that generates anonymous IDs, has the option to not persist the newly generated ID. In this case, it would give a new anonymous key each time the function was called, instead of being consistent other than at key rotation. The downstream consequence of changing the SECRET_KEY that often are unclear so we opt not to do so at this time. In the future if we can ensure that the newly generated IDs are always persisted, we could more safely use random generation.