Visitor identification

This is a semi-technical description of how secure visitor identification with OpenID Connect (OIDC) works in the Puzzel web engage platform. OIDC is one of the most widely used authentication protocols to verify the identity of end-users in a secure way.

1. Standard OIDC-flow

An OIDC authorization code flow (referred to as an “explicit code flow” in this document) allows a third-party application to securely get access to e.g. user info from an identity provider (IDP), or to other protected resources, by redirecting the user to an OIDC-provider/Authorization Server connected to the resource.

When the user visits the OIDC-provider, it will typically check if the user already has a valid session (i.e. is logged in), or prompt the user to log in, request user consent, etc. This all takes place on the OIDC-provider’s domain, ensuring that any information entered by the user (such as username, email, password) is not exposed to the originating application.

Once the user is authenticated by the OIDC-provider, it will redirect back to the originating application, usually with an authorization code appended to the URL. The application can then exchange this code for an access-token that allows it to access information from the IDP or other protected resources.

To secure the code-exchange, the application typically has a pre-configured secret that it needs to supply along with the authorization code to exchange it for a token. The OIDC-provider will only redirect back with an authorization-code to URLs explicitly whitelisted in its configuration. This ensures that an authorization-code is only sent to known and trusted applications, and that only applications that also know the secret can use the code.
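As a rough illustration of the two requests described above, the sketch below builds an authorization URL and the form fields for the code exchange. The parameter names are the standard OIDC/OAuth 2.0 ones, but the endpoint URLs, client id and helper names are made up for the example, and sending the secret as a form field is only one of several client authentication methods a provider may accept.

```python
from urllib.parse import urlencode


def build_authorize_url(authorize_endpoint: str, client_id: str,
                        redirect_uri: str, scope: str, state: str) -> str:
    """Return the URL the application redirects the user to in order to start the flow."""
    params = {
        "response_type": "code",        # ask for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,   # must be whitelisted at the provider
        "scope": scope,
        "state": state,                 # opaque value echoed back by the provider
    }
    return f"{authorize_endpoint}?{urlencode(params)}"


def build_token_request(code: str, client_id: str, client_secret: str,
                        redirect_uri: str) -> dict:
    """Return the form fields posted to the token endpoint to exchange the code."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "client_secret": client_secret,  # the pre-configured secret
    }


# Hypothetical values, for illustration only.
url = build_authorize_url(
    "https://idp.example.com/authorize",
    "my-client",
    "https://app.example.com/callback",
    "openid profile",
    "xyz123",
)
token_form = build_token_request(
    "code-from-redirect", "my-client", "s3cret", "https://app.example.com/callback"
)
```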

2. OIDC-flow in web applications

The standard OIDC-flow described above works well when the application is protected (e.g. running as a back-end service, or is a native app in a phone). However, for web applications running in the users’ web browsers, it is not possible to store a pre-defined secret securely.

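To secure the code exchange in a web application without a secret, the OIDC-flow is usually extended with PKCE (“Proof Key for Code Exchange”). With PKCE, the application generates a random code verifier when it starts the flow and sends only a hash of it (the code challenge) with the authorization request. When the code is later exchanged for a token, the application must supply the original verifier, which lets the OIDC-provider verify that the token exchange request comes from the same application that initiated the flow.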
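A minimal sketch of the PKCE mechanism, assuming the commonly used S256 method (SHA-256 hash, base64url-encoded without padding). The function names are chosen for this example and are not taken from any Puzzel component.

```python
import base64
import hashlib
import secrets


def b64url(data: bytes) -> str:
    """Base64url-encode without '=' padding, as PKCE expects."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple[str, str]:
    """Create a random code verifier and the matching S256 code challenge."""
    verifier = b64url(secrets.token_bytes(32))
    challenge = b64url(hashlib.sha256(verifier.encode("ascii")).digest())
    return verifier, challenge


def verify_pkce(verifier: str, challenge: str) -> bool:
    """Check that a presented verifier hashes to the challenge stored earlier."""
    expected = b64url(hashlib.sha256(verifier.encode("utf-8")).digest())
    return secrets.compare_digest(expected.encode("ascii"), challenge.encode("utf-8"))


verifier, challenge = make_pkce_pair()
```

Only the challenge travels with the initial request. Since the verifier never leaves the application until the final exchange, someone who intercepts the code alone cannot complete the flow.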

3. Puzzel visitor identification with OIDC

The purpose of “secure visitor identification” in Puzzel web engage can be described as: Providing a reliable and secure way to connect/link a visitor’s Puzzel session (i.e. the identity of a visitor in Puzzel’s system/platform, which is usually an anonymous, short-lived session) with the visitor’s identity in an external system (e.g. in the customer’s own IDP, or a secure identity from BankID, MitID, etc.).

Basic requirements when developing this were:

  • Every step in the process must be secure. I.e. a malicious visitor should not be able to tamper with any data in any step to impersonate someone else.

  • An attacker must not be able to extract other people’s sensitive information, or be able to trick the system to leak information to unauthorized parties.

  • The Puzzel system should not get/need access to any sensitive information from the external IDP, besides the user-info explicitly granted by the scope (and a pre-configured secret for code-exchange).

  • All communication to the external OIDC-provider/Authorization Server should follow OIDC standard to enable standardized integration and setup.

  • Any secure identity/user-info of a visitor should be presented to an Agent in a way that indicates that this information is reliable and can be trusted. I.e. the information is not something the visitors themselves have claimed, but comes from a trusted source.

  • The flow should work in all modern browsers with the latest security policies.

To achieve this, the visitor identification in Puzzel utilizes both types of flows described in the previous sections, and the Puzzel platform is designed to integrate as a third-party with any OIDC-provider supporting a silent explicit code-flow.

Any sensitive information like tokens, personal information, secrets, etc. is only sent in the back-channel and handled in a back-end service referred to as the “OIDC-proxy” in this document. When an identification attempt should be made, the OIDC-proxy redirects the visitor to the customer’s external OIDC-provider for authorization. It then exchanges the code for an access-token, requests user-info, temporarily stores the fetched user-info in the back-end and generates a “userInfoId” that refers to this data in the Puzzel platform. Finally, the OIDC-proxy redirects the visitor back to the original URL from which the identification flow was started, with the parameter pzlUserInfoId=<GUID> appended to the URL if identification was successful, or the parameter pzlUserInfoError=<Error message> if it was unsuccessful.

When the visitor web application running on the customer’s site receives pzlUserInfoId appended to the URL after an identification-flow, it can provide this id when starting a chat by calling Puzzel’s chat service. When a new chat is initiated, the chat BE-service will fetch the user-info data (previously stored by the OIDC-proxy) and add it as secure claims in the conversation data that is presented to the Agent.
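The final redirect and how the result is picked up can be sketched as below. Only the parameter names pzlUserInfoId and pzlUserInfoError come from the description above. The helper names, the example URL and the GUID are invented for illustration, and the real visitor application runs in the browser rather than in Python.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def append_query_param(url: str, name: str, value: str) -> str:
    """Return url with one extra query parameter, keeping the existing ones."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((name, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


def read_identification_result(url: str) -> tuple:
    """Return (userInfoId, error) found in the landing URL; either may be None."""
    params = dict(parse_qsl(urlsplit(url).query))
    return params.get("pzlUserInfoId"), params.get("pzlUserInfoError")


# Hypothetical target URL and id.
target_url = "https://www.website.com/support?lang=en"
landed_ok = append_query_param(
    target_url, "pzlUserInfoId", "1b4e28ba-2fa1-11d2-883f-0016d3cca427"
)
landed_error = append_query_param(target_url, "pzlUserInfoError", "Login required")
```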

The whole flow relies on top-frame navigations/redirects, and is protected by PKCE from start to end to prevent anyone from stealing the authorization-code or pzlUserInfoId and successfully linking it to a Puzzel session, e.g. in a chat.

It is also crucial to explicitly configure which URLs should be allowed to receive the pzlUserInfoId after a successful identification-flow. This prevents an attacker from tricking someone else (out-of-band) into performing an identification-flow and then stealing that person’s pzlUserInfoId by forging a redirect to a non-trusted domain. This mechanism is similar (but not identical) to the redirect_uri-whitelist in the OIDC-provider configuration for explicit code flow.

To further reduce attack vectors and hide information from the front-channel during the OIDC-flow, it is also possible to enable PAR (Pushed Authorization Requests) in the SSO-setup.

Puzzel’s design goal has been to perform visitor identification as seamlessly as possible. Ideally, a logged-in visitor should not notice anything besides some extra browser-redirects before the chat starts. If (for any reason) identification fails, the default behavior is to start the chat as an anonymous visitor, and not to display any error messages etc. that would disrupt the user experience.

Diagram of a full visitor identification flow in Puzzel web engage

The diagram below shows an example of a successful visitor identification flow triggered before starting a chat (without PAR enabled for simplicity).

  1. Visitor is logged in and his/her secure identity should be transferred to Puzzel, e.g. before requesting a chat. Visitor app generates a PKCE challenge/verifier pair and navigates the browser to the Puzzel OIDC-proxy. This request contains customer id, targetUrl (current URL, where we want to land after the flow), errorTargetUrl (if visitor identification fails, usually same as targetUrl, but could be a login-page) and the generated PKCE codeChallenge.

  2. The browser is redirected to the customer’s authorization server with the OIDC-parameters determined by the OIDC-proxy. The auth server will check if the visitor has a valid session, and if the configuration is set to prompt=none (recommended), it should always redirect back to redirect_uri and never prompt for login, etc. For setups where e.g. a chat should not be allowed to start if not logged in, prompt could be set to login to enforce logging in before chat. 
    Note: If PAR is enabled, this step will look a bit different. With PAR, the OIDC-proxy would instead have pushed the OIDC-parameters to the PAR-endpoint in the customer's auth-server in step 1 above, and the redirect-URL it sends back to the browser would have the query parameter request_uri. This is only an anonymous reference to the data pushed via the back-channel, i.e. the parameters and values are never exposed in the front-channel.

  3. Browser is redirected to the OIDC-proxy with a code or an error. It will store some state and redirect.

  4. Browser is redirected to OIDC-proxy endpoint /api/oicdcallback.

  5. If the OIDC-proxy received a code from previous steps, it will exchange it (using the Client secret configured in the SSO-configuration) to get an access_token.

  6. The access_token is used to look up userinfo from the customer’s IDP. The claims in userinfo are filtered and mapped according to configuration.

  7. A userInfoId is generated and the data is temporarily stored in Puzzel’s state-server (associated with userInfoId). The initial PKCE-codeChallenge is also stored with the data.

  8. OIDC-proxy finally redirects back to the original URL where the flow started (targetUrl), with the userInfoId added as the query-parameter pzlUserInfoId.

  9. Depending on Puzzel web engage configuration/setup, a new interaction (typically a chat-interaction) will be loaded by the visitor app. When the visitor application requests to start a new chat from CommSrv, it can provide the userInfoId + the PKCE codeVerifier with the request.

  10. CommSrv tries to fetch userinfo from the state-server. It validates PKCE and sets the secure visitor userinfo claims in the conversation.

  11. Routing is done. If no userinfo was included, or it could not be fetched, the chat is still routed, but without any secure claims (i.e. the visitor is allowed to start the chat anonymously).
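Steps 7, 10 and 11 can be sketched as a small in-memory model. This is an illustration only: the class and function names are invented, the S256 challenge method is an assumption, and the actual state-server and CommSrv implementations are not described in this document.

```python
import base64
import hashlib
import secrets
import uuid


class StateServer:
    """Toy stand-in for the store that holds userinfo between steps 7 and 10."""

    def __init__(self) -> None:
        self._entries = {}

    def store(self, user_info: dict, code_challenge: str) -> str:
        """Step 7: keep userinfo together with the initial PKCE challenge."""
        user_info_id = str(uuid.uuid4())
        self._entries[user_info_id] = (user_info, code_challenge)
        return user_info_id

    def fetch(self, user_info_id: str, code_verifier: str):
        """Step 10: return userinfo only if the verifier matches the stored challenge."""
        entry = self._entries.get(user_info_id)
        if entry is None:
            return None
        user_info, code_challenge = entry
        digest = hashlib.sha256(code_verifier.encode("utf-8")).digest()
        computed = base64.urlsafe_b64encode(digest).rstrip(b"=")
        if not secrets.compare_digest(computed, code_challenge.encode("utf-8")):
            return None
        return user_info


def start_chat(state: StateServer, user_info_id=None, code_verifier=None) -> dict:
    """Step 11: the chat is always routed; secure claims are set only when validated."""
    claims = None
    if user_info_id is not None and code_verifier is not None:
        claims = state.fetch(user_info_id, code_verifier)
    return {"secure_claims": claims or {}}
```

Because the challenge is stored next to the data, a stolen userInfoId is of no use without the verifier that only the originating visitor app holds.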

When an agent picks up this conversation later, they will see all claims set in the conversation, and secure claims are marked with a green checkmark.

A flow similar to the one described above can be triggered during an ongoing chat, either by an agent requesting identification by navigating the visitor to the login-page, or if the visitor e.g. navigates to a page that is available for logged-in users only. In this case, the userInfoId is not known when starting/routing the chat, and the visitor app will instead perform the OIDC-flow during the chat as soon as it detects that the visitor has logged in. After a successful identification during chat, the visitor app will POST the userInfoId and PKCE codeVerifier to /api/visitor/identity/<conversation id> to add the secure claims to an existing conversation.

In a similar way, if the visitor is detected to have logged out during an ongoing chat, the visitor app will call DELETE /api/visitor/identity/<conversation id>. This removes the “secure”-flag from the claims, i.e. the claims will still be visible to the agent, but the green checkmark will disappear to indicate that the claims are no longer secure.
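The effect of the two calls on the conversation can be pictured with a small data model. The types below are hypothetical and only illustrate the behaviour described above: adding claims marks them as secure, and a logout clears the flag without removing the claims.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    label: str
    value: str
    secure: bool = False   # shown with a green checkmark when True


@dataclass
class Conversation:
    claims: list = field(default_factory=list)

    def add_secure_claims(self, user_info: dict) -> None:
        """Models the POST: add validated userinfo as secure claims."""
        for label, value in user_info.items():
            self.claims.append(Claim(label, value, secure=True))

    def clear_secure_flag(self) -> None:
        """Models the DELETE: keep the claims but mark them as no longer secure."""
        for claim in self.claims:
            claim.secure = False


conversation = Conversation()
conversation.add_secure_claims({"First name": "Ada", "E-mail": "ada@example.com"})
```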

4. Configuring Puzzel visitor identification

The configuration of Puzzel visitor identification can be separated into two different parts: The visitor SSO configuration and the web engage configuration.

The SSO configuration is the technical configuration needed for the OIDC communication between Puzzel’s OIDC-proxy and web application, and the customer’s OIDC-provider and IDP when an identification flow is performed.

The web engage configuration determines how visitor identification should be integrated with the customer’s web page and the Puzzel web application(s) running on it. I.e. it’s the place to configure “when and how” an identification flow should be performed, and what to do with the result.

4.1 Prerequisites

Puzzel secure visitor identification requires that the customer’s IDP can handle OIDC (OpenID Connect). It needs to support the silent code-flow (i.e. the authorize endpoint should support “prompt=none” and “response_type=code”). The secure claims that will be presented to the agent when a logged-in visitor is identified need to be present either in the access_token or in the response from the userinfo-endpoint in the customer’s IDP.
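Some of these prerequisites can be pre-checked against the IDP’s discovery document. The sketch below works on an already downloaded and parsed document. The metadata field names are the standard ones from OpenID Connect Discovery and the PAR specification, while the function itself and the sample document are only an example. Support for prompt=none is typically not advertised in the discovery document, so it still has to be verified by testing against the IDP.

```python
def check_discovery(doc: dict, need_userinfo: bool = True, want_par: bool = False) -> list:
    """Return a list of human-readable problems found in a discovery document."""
    problems = []
    for key in ("authorization_endpoint", "token_endpoint"):
        if key not in doc:
            problems.append(f"missing {key}")
    if "code" not in doc.get("response_types_supported", []):
        problems.append("response_type=code is not advertised")
    if need_userinfo and "userinfo_endpoint" not in doc:
        problems.append("missing userinfo_endpoint")
    if want_par and "pushed_authorization_request_endpoint" not in doc:
        problems.append("PAR endpoint is not advertised")
    return problems


# Hypothetical, heavily shortened discovery document.
sample_doc = {
    "authorization_endpoint": "https://idp.example.com/authorize",
    "token_endpoint": "https://idp.example.com/token",
    "userinfo_endpoint": "https://idp.example.com/userinfo",
    "response_types_supported": ["code", "id_token"],
}
problems = check_discovery(sample_doc, want_par=True)
```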

4.2 Configuring the customer's IDP settings

To be able to identify visitors in the customer’s own identity provider, some configuration needs to be done in their IDP/OIDC-provider:

  1. Create a new client_id for the Puzzel client.

  2. Ensure that the IDP's authorize-endpoint supports silent code-flow (i.e. the request parameters prompt=none and response_type=code).

  3. Create a client_secret for the client_id created in (1) above. The secret should allow an auth-code from an OIDC code-flow to be exchanged for an access_token for fetching data from the IDP’s userinfo-endpoint.

  4. Determine which claims should be available/displayed to the agent, what each claim should be labelled in the agent chat UI, and determine which claims contain personal/sensitive data (PII) that should not be archived e.g. due to GDPR.

  5. Determine which scopes are needed for the Puzzel client to get access to all the claims determined in (4) above (e.g. openid, profile, email). Claims can be extracted from the access_token or from the userinfo-endpoint.
    Note: The scopes listed for visitor identification must be a subset of the scopes specified when logging in. I.e. there must not be any additional scopes listed here that were not granted in the login-flow.

  6. Add the Puzzel Callback URL to the redirect_uri-whitelist for the Puzzel client.
    The callback URL is generated when creating a Puzzel Visitor SSO configuration (described below), and it should have the following form:
    https://app-consumeridp.puzzel.com/federation/{Schema}/signin (Puzzel’s production env)

  7. Determine from which domains/URL(s) on the customer website an identification-flow should be allowed. This is typically the start of the URL from where chats may be started, which is often simply the domain of the customer website. E.g. "https://www.website.com" would allow identification flows to be performed on any sub-page of https://www.website.com.
    This Target URL whitelist has a function similar to the redirect_uri-whitelist in the IDP (as described in (6) above), but it needs to be more relaxed (the matcher will allow any URL starting with any of the specified target URLs).
    This is because the final outcome/result of an identification-flow needs to be passed to any URL on the website from which an identification may start.
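The “starts with” matching described in step 7 can be sketched as a plain prefix check. The function below is an illustration based on that description and not Puzzel’s actual matcher, which may apply additional checks that are not covered here.

```python
def is_target_url_allowed(target_url: str, whitelist: list) -> bool:
    """Allow a target URL if it starts with any of the whitelisted base URLs."""
    return any(target_url.startswith(prefix) for prefix in whitelist)


# Hypothetical whitelist with a single entry ending in a slash.
whitelist = ["https://www.website.com/"]

allowed = is_target_url_allowed("https://www.website.com/support/chat", whitelist)
blocked = is_target_url_allowed("https://evil.example/steal", whitelist)

# With a pure prefix check, an entry without a trailing slash also matches
# look-alike hosts that merely begin with the same characters.
lookalike = is_target_url_allowed(
    "https://www.website.com.evil.example/", ["https://www.website.com"]
)
```

For a pure prefix match, ending each entry with a slash (or a longer path) avoids the look-alike case shown in the last call.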

4.3 Configuring Puzzel Visitor SSO settings

The Visitor SSO can be set up here:
https://app.puzzel.com/settings/

A link to this page should be available for administrators with the correct authorizations: open https://app.puzzel.com/admin/, click the icon in the top-right corner to expand the menu, and click “Organisation Settings”.

Scroll down to the section “Security”, “Visitor SSO” and click “Configure”:

Use the “Add” button to add a new Visitor SSO configuration.

Add/edit Visitor SSO configuration

 List of claims to display

Add/edit claim mapping

  • Puzzel "Solution ID" for which this config should be used. Each Solution ID can have multiple Visitor SSO configurations.

  • "Display Name" for this configuration. The display name is used to identify the configuration when selecting which one to use in the identification interactions (described later in this document).

  • "Type" is currently always OIDC (not editable). Other types may be supported in the future.

  • "Authority": The base URL of the customer's IDP, e.g. "https://login.microsoftonline.com/aa198d57-8eb0-435c-a6e3-86e15e8490a9/oauth2/"

  • "Discovery endpoint": The full URL to the IDP’s discovery endpoint, e.g. 
    https://login.microsoftonline.com/aa198d57-8eb0-435c-a6e3-86e15e8490a9/v2.0/.well-known/openid-configuration

  • "Target URL whitelist": A list of base URLs from which an identification flow should be allowed. See description in 4.2 bullet 7 above. Important: Only add trusted URLs/domains here. A malicious user controlling a whitelisted domain could exploit it to impersonate someone in a chat.

  • "Client ID": The created "client_id" from (4.2 1) above.

  • "Client secret": The created "client_secret" from (4.2 3) above.

  • "Scopes": All scopes needed from (4.2 5) above to get all desired claims.

  • "Get claims from UserInfo endpoint": If all the desired/mapped claims are not present in the tokens, this setting should be enabled to fetch userinfo in an additional request. Note that the appropriate scope needs to be added to allow accessing the userinfo endpoint.

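  • “Enable PAR”: Enables PAR (Pushed Authorization Requests) for this configuration, so that the OIDC-parameters are pushed to the customer's auth-server via the back-channel instead of being exposed in the front-channel (see section 3). This requires that the customer's IDP supports PAR.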

  • Add all claim mappings from (4.2 4) above.
    Note: Only claims added to this mapping will be visible to the agent.
    For each mapped claim, the following info should be configured:
    - "Key": The name of the claim in the IDP (either in the tokens or in the userinfo).
    - "MapType": One of "ChatId", "NickName", "Variable".
      Only one claim mapping of type "ChatId" and one of type "NickName" can be set. All other mappings need to have "MapType": "Variable".
      “ChatId” and “NickName” have special meaning in the Puzzel platform, and are used e.g. to create more readable content in archived chat logs.
    - "Description": What the claim will be labelled in the agent chat UI (e.g. "First name", "E-mail", "pnr").
    - "PII": "Yes"/"No". This setting determines whether the claim is to be regarded as personal/sensitive information.
      Claims with "PII": "Yes" will not be archived, or will have their value masked in archived chat transcripts, for GDPR compliance (e.g. "pnr").
      Claims with "PII": "No" will be stored in plain text with archived chat transcripts (e.g. "First name").
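The claim mapping rules above can be illustrated with a small sketch. The field names ("Key", "MapType", "Description", "PII") follow the settings listed above. The sample mappings, the function names and the "***" mask are assumptions made for the example and do not show how Puzzel stores or archives claims internally.

```python
from collections import Counter

# Hypothetical claim mappings.
MAPPINGS = [
    {"Key": "sub", "MapType": "ChatId", "Description": "Customer id", "PII": "No"},
    {"Key": "given_name", "MapType": "NickName", "Description": "First name", "PII": "No"},
    {"Key": "pnr", "MapType": "Variable", "Description": "pnr", "PII": "Yes"},
]


def validate_mappings(mappings: list) -> list:
    """Check the MapType rules: known types only, at most one ChatId and one NickName."""
    errors = []
    counts = Counter(m["MapType"] for m in mappings)
    for unique_type in ("ChatId", "NickName"):
        if counts[unique_type] > 1:
            errors.append(f"more than one mapping of type {unique_type}")
    for m in mappings:
        if m["MapType"] not in ("ChatId", "NickName", "Variable"):
            errors.append(f"unknown MapType for key {m['Key']}")
    return errors


def map_claims(raw_claims: dict, mappings: list) -> dict:
    """Agent view: keep only mapped claims, labelled with their Description."""
    return {m["Description"]: raw_claims[m["Key"]]
            for m in mappings if m["Key"] in raw_claims}


def archive_view(raw_claims: dict, mappings: list, mask: str = "***") -> dict:
    """Archive view: same claims, with PII values masked."""
    return {m["Description"]: (mask if m["PII"] == "Yes" else raw_claims[m["Key"]])
            for m in mappings if m["Key"] in raw_claims}


raw = {"sub": "c-1001", "given_name": "Ada", "pnr": "19121212-1212", "email": "ada@example.com"}
agent_claims = map_claims(raw, MAPPINGS)
archived_claims = archive_view(raw, MAPPINGS)
```

Note that "email" is dropped in both views, since it has no mapping.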

4.4 Web Engage configuration in general

This section is a brief description of how Puzzel Web Engage configuration works in general.

Setting up a web engage configuration typically involves creating rules that match specific pages or sections of the customer website, and creating and linking a series of interactions together to build an interaction chain.

A simple, but very common setup is a rule that matches all pages on the website.
When the rule matches, it will trigger and launch a “Panel”-interaction during office hours (i.e. when the customer support is open), or an “Information panel” during off-hours.

This “Panel”-interaction may contain a form and an “I want to chat”-button that is linked to a “Chat”-interaction. The “Chat”-interaction is often linked to a “Post chat”-interaction (where a visitor can download a pdf of the conversation, etc), and there may be a “Survey”-interaction as a last step in the chain.

This setup forms a rule and an interaction chain that makes a panel (or just an icon in the corner) appear to visitors on every webpage during office hours. If the visitor expands the panel, optionally fills in a form and clicks “Chat”, the Puzzel UI will transition to the “Chat”-interaction.
When the chat ends, the application transitions to “Post chat”, and finally to the “Survey”-interaction, after which the chain is finished.

Such an interaction chain could look something like this:

4.5 Web Engage configuration with visitor identification

This section describes how to add visitor identification to an interaction chain - i.e. how to control exactly how and when an identification flow should be attempted.

Since a visitor identification OIDC-flow involves a series of browser redirects, it can be considered an “expensive” operation (that may possibly interfere with the visitor’s user experience). So we usually do not want to perform the flow unless we need to.

To add visitor identification to the above example, a “Visitor identification”-interaction is typically inserted in the interaction chain before the chat-interaction.

A “Visitor identification”-interaction is a logical building block that does not have a visible UI. It contains configuration parameters to control how an identification flow should be performed, and it should be placed in the interaction chain to control exactly when it should be performed. The chat interaction is currently the only interaction type that can utilize secure visitor identification data, so in a typical setup it should be placed immediately before the “Chat”-interaction.

If the identification-interaction is placed earlier in the chain (i.e. as first interaction, before “Panel”), it would be triggered on every page-load during office hours, which would cause a lot of redirects, bad UX for the visitor and high load on the OIDC-proxy and IDP. And if placed later in the chain (i.e. after “Chat”), the visitor’s identity will not be picked up until after the chat has ended.
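The placement can be pictured by treating the example chain as a simple ordered list. This is only a mental model of the ordering. Interaction chains are built in the Puzzel configuration UI, not in code, and the helper below is hypothetical.

```python
def insert_before(chain: list, new_interaction: str, before: str) -> list:
    """Return a copy of the chain with new_interaction placed right before `before`."""
    index = chain.index(before)
    return chain[:index] + [new_interaction] + chain[index:]


chain = ["Panel", "Chat", "Post chat", "Survey"]
updated_chain = insert_before(chain, "Visitor identification", "Chat")
```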

The updated interaction chain would look something like this:

5. Optimizing Visitor Identification

As mentioned in the previous section, an OIDC identification flow is usually regarded as an “expensive” operation, which means that we typically want to avoid performing it unless it’s needed.

Unnecessary identification attempts can be avoided by properly structuring the rules and interaction chains. E.g. no rule that matches a lot of pages should directly trigger an identification interaction. Usually, there should be some user interaction before attempting to do an identification flow (e.g. a visitor actually wants to chat and clicks a button in a panel, etc).

But even if an identification interaction is properly placed immediately before chat, there are still situations where it is useless to perform an identification flow. E.g. if we know that the visitor has not logged in, and the identification flow would surely fail to identify the visitor, but we still want to allow starting a chat as an anonymous visitor.

For determining whether a visitor is currently logged in or logged out, there are two other (logical) interaction types available: “Detect visitor login” and “Detect visitor logout”.

Their only purpose is to detect and remember if a visitor is likely logged in or logged out. This state can later be used in an identification interaction to e.g. skip the flow and go directly to chat if the visitor has never been detected as logged in (or was logged in, but logged out again before trying to start a chat).

The typical setup when using the “Detect visitor login”/“Detect visitor logout” interaction types is to create two new, separate rules that match some browser criteria associated with being logged in or logged out, and trigger these interactions. These rules typically match a login landing page URL or a logout landing page, or some DOM elements visible only when the visitor is logged in or logged out.

The logged in/out state registered by these “Detect”-interactions can then control how the “Identification interaction” in the main chain (the one starting with a rule that leads to chat) should behave later when launched. If its configuration for Perform identity check is set to If login has been detected, the Identification interaction will only perform an OIDC-identification flow if the visitor has previously triggered “Detect login”. If the visitor is not detected as “logged in”, the identification interaction will just transition to the next interaction in the chain (unless the setting Continue if identification fails is disabled).
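The behaviour described above can be summarised as a small decision function. The setting values “If login has been detected” and “Manually/On demand” are the ones named in this document. The returned action strings, and the assumption that the chain simply stops when “Continue if identification fails” is disabled, are simplifications for illustration. Any other setting value is rejected rather than guessed at.

```python
def identification_step(perform_identity_check: str, login_detected: bool,
                        continue_if_identification_fails: bool = True) -> str:
    """Return what the identification interaction does when it is launched."""
    if perform_identity_check == "If login has been detected":
        if login_detected:
            return "perform OIDC flow"
        # Not detected as logged in: skip the flow entirely.
        return "next interaction" if continue_if_identification_fails else "stop"
    if perform_identity_check == "Manually/On demand":
        # Never runs before chat; identification is only triggered during chat.
        return "next interaction"
    raise ValueError(f"setting not covered by this sketch: {perform_identity_check}")
```

For example, a visitor who has never triggered “Detect login” goes straight to the chat with the default settings, without any redirects.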

Rules triggering interaction types “Detect visitor login” and “Detect visitor logout” as first/only interaction will always be evaluated, independently of any other rules. The rules could match specific URL-patterns, like login- and logout-landing-pages that visitors will always be navigated to when logging in or out. Or they can match specific elements in the DOM of the page (e.g. if a button with the text “Logout” is visible on the page, the visitor is probably logged in and the rule triggering the “Detect login”-interaction should trigger). Or if an element with e.g. the css selector “#loginForm” is visible in the DOM, the visitor is probably not yet logged in, so the rule with the “Detect logout”-interaction as outcome should trigger.

An example of rules and interactions set up with Detect visitor login/logout-interactions:

Instead of setting up rules and “Detect”-interactions to register a visitor’s logged in/out state, it is also possible to use the visitor front-end API to set the state from a script by calling:
pzl.api.setLoginDetected(<true/false>)

This will control how an identification interaction will behave later on, exactly like triggering the “Detect”-interactions from engagement rules.

6. Visitor identification during an ongoing chat

A common scenario is that any visitor should be able to start a chat, regardless of being logged in or not. If the visitor is logged in before starting the chat, an identification flow should be performed to get his/her identity and set it in the conversation. And visitors that can’t be identified should also be allowed to start a chat without any secure visitor claims.

If a visitor starts a chat as non-identified, but logs in during an ongoing chat, the Puzzel chat application can set secure claims in an already started conversation. This is triggered by a “Detect visitor login”-interaction being launched (by a rule), or by an API-call to pzl.api.setLoginDetected(true). I.e. if there is an ongoing chat conversation, the visitor is not already identified, and the login state flips to “logged in”, an OIDC-flow can be performed during chat.

For this to work, there needs to be an “Identification interaction” somewhere in the chain before chat (which there normally is to detect already logged in visitors before starting the chat). When identifying a visitor during chat, the settings from this identification interaction will be used.

So even if (for some reason) you never want an identification flow to be performed before starting the chat, there still needs to be a configured “Identification interaction” somewhere in the chain (before chat) for identification during chat to work. If its Perform identity check is set to Manually/On demand, identification will only be triggered during chat.
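The conditions for identification during chat, as described in this section, can be collected in one predicate. The function and parameter names are invented for this sketch. In practice the visitor app tracks this state itself.

```python
def should_identify_during_chat(chat_ongoing: bool, already_identified: bool,
                                was_logged_in: bool, is_logged_in: bool,
                                has_identification_interaction: bool) -> bool:
    """True when an OIDC-flow can be performed during an ongoing chat."""
    flipped_to_logged_in = is_logged_in and not was_logged_in
    return (chat_ongoing
            and not already_identified
            and flipped_to_logged_in
            and has_identification_interaction)
```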
