Next Steps, Not Flow, Is All We Care

Sean Wu
Mar 6, 2024

All GUI interactions can be defined by a flowchart. But is flow enough for conversation design? It has long been suspected that flow is inadequate for building conversational experiences. Unfortunately, since no better choice is available, every conversation designer still uses it. This sentiment is captured well in a great discussion started by Kane Simms two years ago.

The majority of that discussion occurred when the shallow NLU model was the most challenging aspect of chatbot development, commanding the lion’s share of brainpower for everyone involved. This might be why some key insights were raised in the thread but not fully developed. Now that dialog understanding is solved by LLMs, we can turn our attention back to this age-old interaction design question: “flow or not flow.” But first, let’s recap that thought-provoking thread.

Flow is not enough

There is a fundamental difference between an application with a graphical user interface (GUI) and one with a conversational user interface (CUI). With a GUI, users can only interact with your application through pre-defined pathways; for instance, they cannot specify a destination on the date-picker page. Developers have full control over the interaction, so they only need to specify the desired behavior for these pre-defined interaction paths to create a usable GUI experience.

“Flow-based logic, such as ‘if this then that’ flows, is suitable for building something straightforward, like a game of survival, where pre-determined flows suffice. However, conversations don’t follow pre-defined journeys often. As soon as you begin altering elements mid-conversation, like changing a pizza order from margarita to ham and mushroom, but then deciding to add extra pepperonis and revert the change, the system begins to break down. Managing this complexity becomes challenging, as conversations inherently possess infinite complexity. Additionally, testing such flows can prove difficult.” That is in Kane’s words (first transcribed by speech-to-text, then checked by ChatGPT).

Users changing their minds mid-conversation is one reason why they cannot stick to the happy path you designed for them. Sometimes, users also need to take a detour, interacting with the chatbot to obtain side information or handle other tasks before returning to their original objective. In conversation, users are accustomed to expressing their needs freely, so the chatbot must be able to react to any reasonable input for the interaction to feel natural and usable. Unfortunately, attempting to explicitly cover all possible conversation paths using flow is a hopeless endeavor, as more than one person in the thread explained.

Then why does every conversation designer still use it?

In that thread, even though many conversation designers know the limitations of flow, most of them admit that they still use it. But why? Because:

“For all the criticism flow diagrams receive, I’ve never seen a viable alternative when it comes to the design process,” says Matt Buck.

And it is not as if we have not tried to find an alternative.

“We built an entirely new tool in summer of 2019 to test the concept of no paths/lines as a way to learn about the best framework. What we realized after a year of engineering on that new CUI platform was that lines are representations of dialog context and as long as they aren’t hard rules for the conversation, they are the best way to visualize context,” says Braden Ream.

Indeed, the business process can and will branch based on context. To expose such processes through a conversational user interface, there are bound to be dependencies between the components of the conversation. For example, “you shouldn’t be able to complete a purchase if you haven’t placed an order.”

In the end, the suggestion is that we should use flow to visualize the dependency (or context), while the implementation takes that visualization and does the right thing. But is this possible? Is flow an efficient encoding of dependency?

Next steps are all we care about

To answer that, let’s revisit the nature of conversation design, particularly interaction design. It was put perfectly as follows:

“Conversations shouldn’t ever need flows, only an idea of a next step. As in, once someone has x question answered, what might they need next?” By Kane Simms.

In other words, at the interaction level, conversation design revolves around determining which dialog act to communicate to the user based on the conversation history. Businesses need to construct deterministic chatbots to ensure consistent and reproducible user experiences, as well as for compliance purposes. Therefore, all we truly care about is the ability to define the conversational behavior of the chatbot using some ‘if-this-then-that’ rules, where ‘this’ (the antecedent) represents the context and ‘that’ (the consequent) denotes the dialog acts that the bot needs to communicate to users under that context.

Given a set of rules defined by the designer, the bot simply iterates through them, checking whether the antecedent (this) of each rule matches the context implied by the conversation history and current business conditions such as inventory. When a match is found, there are two different strategies for deciding whether to execute the corresponding consequent (that) immediately:

  1. “Always Do That” (Sufficient Condition): When the condition of a rule is met, its action is always executed, without further consideration. There is no room for the chatbot to make smart decisions, so designers have to specify everything.
  2. “Can Do That” (Necessary Condition): When the condition of a rule is met, its action can be executed, but execution is not mandatory. This gives the chatbot more flexibility, as it may consider multiple rules whose conditions are satisfied and then select the most appropriate action based on additional criteria or context. Designers don’t need to exhaustively specify every possible condition; instead, they define rules based on essential criteria, allowing the chatbot to make more autonomous decisions.
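The two strategies can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any particular product: the `Rule` class, the boolean-fact `Context`, and the use of a numeric `priority` as the "additional criteria" are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical context representation: named boolean facts.
Context = Dict[str, bool]


@dataclass
class Rule:
    antecedent: Callable[[Context], bool]  # the "this"
    consequent: str                        # the "that": a dialog act label
    priority: int = 0                      # assumed tie-breaker, used only by can_do


def always_do(rules: List[Rule], ctx: Context) -> Optional[str]:
    """Sufficient condition: the first rule whose antecedent matches fires."""
    for rule in rules:
        if rule.antecedent(ctx):
            return rule.consequent
    return None


def can_do(rules: List[Rule], ctx: Context) -> Optional[str]:
    """Necessary condition: matching rules are candidates; the bot picks one."""
    candidates = [r for r in rules if r.antecedent(ctx)]
    if not candidates:
        return None
    # Stand-in for "additional criteria": choose the highest-priority candidate.
    return max(candidates, key=lambda r: r.priority).consequent


rules = [
    Rule(lambda c: not c["quantity_known"], "REQUEST quantity", priority=1),
    Rule(lambda c: not c["size_known"], "REQUEST size", priority=2),
]
ctx = {"quantity_known": False, "size_known": False}

print(always_do(rules, ctx))  # rule order decides
print(can_do(rules, ctx))     # the selection criterion decides
```

With both facts missing, `always_do` is bound to rule order, while `can_do` treats both rules as permitted and lets a separate criterion choose between them.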

Clearly, we want to adopt a necessary condition system to reduce the designer’s effort level while creating the desired conversational experience. But adopting such a system does not automatically give us that benefit; we need a good set of rules. Ideally, the antecedent of a rule should be minimal, meaning it should encode only the essential dependency of its action as a condition. Including irrelevant dependencies as conditions will split the condition space and end up requiring more rules to cover the remaining cases.

For example, suppose (REQUEST quantity) depends only on “quantity-is-missing”, but you add “price-is-notified” as an additional condition to that rule. You then need a second rule with the same action, this time with both “quantity-is-missing” and “price-is-not-notified” as antecedents, in order to achieve the same result.
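The example above can be made concrete with a small sketch. The dictionary-based antecedents and the helper names `matches` and `next_acts` are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: an antecedent lists the facts that must hold.
Antecedent = Dict[str, bool]
Rule = Tuple[Antecedent, str]


def matches(antecedent: Antecedent, ctx: Dict[str, bool]) -> bool:
    """A rule applies when every fact it names has the required value."""
    return all(ctx.get(name) == value for name, value in antecedent.items())


def next_acts(rules: List[Rule], ctx: Dict[str, bool]) -> List[str]:
    """Collect the dialog acts of all rules that apply in this context."""
    return [act for antecedent, act in rules if matches(antecedent, ctx)]


# Minimal antecedent: only the essential dependency.
minimal = [({"quantity_missing": True}, "REQUEST quantity")]

# Over-specified antecedent: an irrelevant condition splits the condition space.
over_specified = [
    ({"quantity_missing": True, "price_notified": True}, "REQUEST quantity"),
]

# A second rule with the same action is needed to cover the other half.
patched = over_specified + [
    ({"quantity_missing": True, "price_notified": False}, "REQUEST quantity"),
]

ctx = {"quantity_missing": True, "price_notified": False}
print(next_acts(minimal, ctx))
print(next_acts(over_specified, ctx))
print(next_acts(patched, ctx))
```

In this context the single minimal rule fires, the over-specified rule does not, and it takes two over-specified rules to recover the behavior of the one minimal rule.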

Flow antecedents are not minimal

A conversation flow, or conversation path, is simply a sequence of exchanges between the user and the chatbot. Flow, whether linear or non-linear, can be readily converted into rules that define chatbot behavior: for every system turn K, we create a rule where the exchange sequence on this path up to turn K-1 serves as the ‘this’ or antecedent, and the dialog act emitted by the bot at turn K serves as the ‘that’ or consequent.
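The conversion described above is mechanical, as the following sketch suggests. The `(speaker, dialog_act)` turn encoding and the sample pizza-ordering path are hypothetical, chosen only to make the idea runnable.

```python
from typing import List, Tuple

# Hypothetical turn encoding: (speaker, dialog_act), speaker is "user" or "bot".
Turn = Tuple[str, str]
# A flow rule: (everything said before turn K, the bot's act at turn K).
FlowRule = Tuple[Tuple[Turn, ...], str]


def flow_to_rules(flow: List[Turn]) -> List[FlowRule]:
    """Emit one rule per bot turn, using the whole preceding path as antecedent."""
    rules: List[FlowRule] = []
    for k, (speaker, act) in enumerate(flow):
        if speaker == "bot":
            rules.append((tuple(flow[:k]), act))
    return rules


flow = [
    ("user", "INFORM intent=order_pizza"),
    ("bot", "REQUEST size"),
    ("user", "INFORM size=large"),
    ("bot", "REQUEST quantity"),
]

rules = flow_to_rules(flow)
for antecedent, act in rules:
    print(len(antecedent), "prior turns ->", act)
```

Note how the antecedent grows with K: the rule for the second bot turn already carries three prior turns as its condition.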

Unfortunately, any flow antecedent with a length greater than one is not minimal. This is because the bot does not know the essential reason for emitting a dialog act at turn K. Is it because of the user’s dialog act at K-1, or some dialog act before that? Or some combination of what has happened so far? Even when turns are merely shuffled, it is not clear whether the action should still apply. The bot can only execute a rule if the current conversation exactly matches the conversation flow defined by its antecedent, so any deviation from that flow antecedent renders the corresponding rule useless. The longer the flow in the condition, the fewer paths the rule covers and the less useful it becomes. As long as we can define the context and the corresponding desired next step, we are fine.
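To see how brittle exact matching is, consider a minimal sketch in which the user takes a short detour. The turn encoding, the `context_of` helper, and the way "quantity is missing" is derived from history are all assumptions made for this illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical turn encoding: (speaker, dialog_act).
Turn = Tuple[str, str]


def flow_rule_fires(antecedent: List[Turn], history: List[Turn]) -> bool:
    """A flow rule applies only when the history equals its antecedent exactly."""
    return history == antecedent


def context_of(history: List[Turn]) -> Dict[str, bool]:
    """Assumed context extraction: quantity is missing until the user gives it."""
    informed = any(
        speaker == "user" and act.startswith("INFORM quantity")
        for speaker, act in history
    )
    return {"quantity_missing": not informed}


# Antecedent of the flow rule whose consequent is "REQUEST quantity".
antecedent = [
    ("user", "INFORM intent=order_pizza"),
    ("bot", "REQUEST size"),
    ("user", "INFORM size=large"),
]

# The same conversation, except the user asks about the menu along the way.
detour = [
    ("user", "INFORM intent=order_pizza"),
    ("bot", "REQUEST size"),
    ("user", "REQUEST menu"),
    ("bot", "INFORM menu"),
    ("user", "INFORM size=large"),
]

print(flow_rule_fires(antecedent, antecedent))  # designed path
print(flow_rule_fires(antecedent, detour))      # path with a detour
print(context_of(detour)["quantity_missing"])   # context-based condition
```

The flow rule fires on the designed path but not on the detour, whereas a rule conditioned only on "quantity is missing" still applies in both cases.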

Parting words

The flow chart brings two things to the table: an easy-to-digest visualization tool and a way to encode the dependency between interaction components.

Visualizing conversation behavior explicitly, which could consist of exponentially many conversation paths, will always be mathematically intractable, regardless of the tool used. But if the goal is to build the conversational experience, do designers really need to visualize the desired behavior?

Utilizing flow to encode dependencies isn’t efficient, as it often results in non-minimal rules. However, is encoding dependency for conversation design also an intractable problem? Is there a more efficient method to encode essential dependencies, allowing designers to specify desired behavior with less effort while still covering all possible user interactions? Fortunately, the answer is yes.
