Defining a new 'on the go' voice assistant
This project was preceded by a previous research engagement, which you can read about here. Following our concept showcase presentation, our stakeholders gave us the green light to begin validating our two concepts through a make-to-learn project consisting of validation testing and rapid, iterative prototyping. We would have 7 weeks to take these concepts and interrogate their value propositions through iterations of designing prototypes, testing with users, and incorporating our learnings. At the kickoff of this project we also restructured the team, with two researcher-strategists rolling off and two engineers rolling on.
The Smart Helmet & RideAlong App
The Smart Helmet is not just a tech-laden safety helmet; it is also enhanced by software-powered experiences.
Multimodal HMI for shared micromobility services
A first-of-its-kind HMI for OEMs and service providers, unlocking a range of features that benefit both riders and policymakers.
Customer profile
In the first phase of this project, we developed a customer profile of our customer segment, the “Micromobility Rider”. This profile helps us understand who we are creating value for, and what their highest priority needs are.
Top Jobs
- Get from A to B quickly
- Navigate to destination
- Actively maintain control of the vehicle
Top Gains
- Freedom, full control
- Easily accessible
- Fun & wellness
Top Pains
- Requires full attention
- Feels unsafe
- Difficult to navigate
We then broke our customer segment down further to ensure that we were recruiting a diverse group of participants for our research, with differing motivations and comfort levels when it comes to riding micromobility vehicles.
Fitness Rider: A rider who has invested time and money into riding for the health benefits of cycling.
Casual Rider: A bike owner who rides regularly for commuting, recreation, and/or utility.
Shared Rider: A rider using shared services who has unique experiences compared to bike owners.
New Rider: A rider who has limited experience with micromobility or rides only infrequently.
Hypothesis-driven validation
We used a hypothesis-driven validation methodology, which is a systematic approach to validating an idea by breaking it down into testable hypotheses in order to gradually reduce risk through evidence-based decision making.
To figure out where to start our validation testing, I began by identifying the assumptions we were making in our value propositions: the things that need to be true for our value proposition to meet the needs of our users and be desirable to them.
I reframed these assumptions into hypotheses using 'We believe that…' statements and then prioritized them by criticality and risk. This began with identifying any potential 'idea killers': assumptions we would want to begin testing and validating immediately. These hypotheses might include risks to desirability, feasibility, usability or viability.
Test design & Evaluating prototyping approaches
After identifying what we needed to validate I wanted to take some time to understand which approaches would help us learn the most, in the shortest amount of time. If we don't need to develop functional code-based prototypes, we shouldn't. At the same time, for some of our hypotheses, especially those around the spatial rendering of audio, we would need to ensure a level of experiential fidelity. Below are the types of prototypes that I evaluated in this process.
After this initial evaluation, and running a scoping session with our engineers, I selected two prototyping methods that would give us the flexibility to validate our assumptions and provide the highest degree of experiential fidelity.
WoZ prototype of Voice assistant with Spatial Sound Rendering
First Prototype
WoZ (Wizard of Oz) prototype with spatial sound rendering
In our participant screening, we selected cyclists with a moderate-to-high degree of experience and had them sign a waiver confirming their competency as cyclists. The road tests would place our users into a real-life traffic context in order to test the desirability and usability of our smart helmet and smart bicycle concepts.
Typically, for lean prototyping of a conversational interface, I would use a 'Wizard of Oz' approach and design with Simili, a design and WoZ testing tool we had developed at Connected. This project presented a new challenge, though: our product concept was an on-the-go product with novel spatial audio capabilities, the desirability of which we needed to validate. To really validate our value proposition, I would need to figure out new approaches to prototyping and testing.
Spatial Audio & acoustic AR
The smart glasses and smart helmets we were designing for have the ability to render audio spatially using data from an integrated magnetometer and gyroscope. By combining this data with GPS location, we were able to explore a kind of acoustic AR with a 360° soundstage. We were curious whether it could be applied to any of our navigation or entertainment experiences.
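To make the idea concrete, here is a minimal sketch (in Kotlin, not our production code) of the core math behind that soundstage: compute the great-circle bearing from the rider's GPS fix to a geolocated sound anchor, then subtract the rider's compass heading to get the angle the sound should be rendered at.

```kotlin
import kotlin.math.*

// Initial great-circle bearing (degrees from true north) from the rider's
// GPS fix to a geolocated sound anchor.
fun bearingTo(lat1: Double, lon1: Double, lat2: Double, lon2: Double): Double {
    val phi1 = Math.toRadians(lat1)
    val phi2 = Math.toRadians(lat2)
    val dLon = Math.toRadians(lon2 - lon1)
    val y = sin(dLon) * cos(phi2)
    val x = cos(phi1) * sin(phi2) - sin(phi1) * cos(phi2) * cos(dLon)
    return (Math.toDegrees(atan2(y, x)) + 360.0) % 360.0
}

// The angle to render the sound at on the 360° soundstage, relative to
// where the rider's head is pointing (heading from magnetometer/gyroscope).
fun renderAngle(bearingDeg: Double, headingDeg: Double): Double =
    (bearingDeg - headingDeg + 360.0) % 360.0

fun main() {
    // A sound anchored north-east of the rider while they face due north
    // should render at roughly 35–40° to their right.
    val bearing = bearingTo(43.6385, -79.4146, 43.6395, -79.4136)
    println("Render at %.1f°".format(renderAngle(bearing, headingDeg = 0.0)))
}
```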

Route & dialogue design
Between our two prototypes, we tested 10 different assistant features based on our hypotheses in both concepts 1 and 2. We tested participants on bicycles as well as electric scooters across a predetermined four-segment route through the entertainment district of Toronto and along the waterfront.
- Conversational Interface
- Turn by Turn Navigation
- Tailored Routing
- Proactive Warnings
- Personalization
- Communication
- Music & Entertainment
- Fitness Tracking
- Spatially Rendered Audio
- Tour Guide
WoZ prototype hardware setup
The test setup was unique and fairly complex, so I'll take a moment to unpack it. The tests were conducted by a road test liaison in partnership with me, acting as the remote operator. The liaison facilitated our tests with the participants, riding with them between waypoints in the city, and conducted a short interview with them between each test. As the remote operator, I stayed in our office, operating our WoZ prototypes using a tool we developed specifically for this project.
I was able to listen to our research participant and their interactions with our voice assistant through a microphone attached to their helmet. I could then trigger our voice assistant's responses with a web application we developed. The web application would communicate with the mobile application, which would synthesize speech locally on a mobile device stowed in a small backpack each participant wore during the road test. This synthesized speech and media would play back on a pair of smart glasses.
As the operator, I was able to listen to the participant's voice on a one-way call while simultaneously on a two-way call with the liaison. To augment this line of communication, the participant's location was broadcast to me with real-time GPS tracking, providing the additional context essential to operating a successful WoZ test.
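The plumbing between the console and the companion app can be summarized in a few lines. The Kotlin sketch below is a simplification: the message shape and field names are assumptions for illustration, with simple stand-ins for the on-device text-to-speech and media playback.

```kotlin
// Illustrative shape of a trigger message sent from the operator's console
// to the companion Android app; these field names are assumptions.
data class TriggerMessage(
    val text: String? = null,      // dialogue to synthesize on-device
    val soundFile: String? = null, // pre-loaded earcon or music clip
    val angleDeg: Double? = null   // angle from true north; null = plain stereo
)

// Stand-ins for Android's TextToSpeech and our spatial media player.
fun speak(text: String) = println("TTS: \"$text\"")
fun play(file: String, angleDeg: Double?) =
    println("Play $file at " + (angleDeg?.let { "$it° from north" } ?: "head-locked stereo"))

fun handleTrigger(msg: TriggerMessage) {
    msg.text?.let(::speak)
    msg.soundFile?.let { play(it, msg.angleDeg) }
}

fun main() {
    handleTrigger(TriggerMessage(text = "Turn left ahead", soundFile = "earcon_left.wav", angleDeg = 270.0))
}
```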
Spatial Audio WoZ prototyping tool
In previous work, out of necessity, I'd already developed a WoZ design and testing tool, Simili, and I initially considered using it for our prototyping. I quickly realized, however, that we would need to develop new capabilities for it, and this might take too long. Alternatively, we could develop something entirely new. After scoping both options, for the sake of velocity we went with the latter, designing and building something extremely lightweight that included only the functionality we needed.
I needed a testing tool that would help us validate some of our assumptions regarding spatial audio, acoustic augmented reality, and location-aware conversational agents. I hypothesized that by combining the audio augmented reality SDK our client had developed with Resonance, an open-source library from Google, we would be able to do the real-time spatial audio rendering we needed. Our engineering pair built a very small proof of concept to confirm this approach was feasible, and we went forward and developed this prototyping software in less than two weeks, which required a lot of quick problem solving and prioritization.
Instead of designing the entire front end of a WYSIWYG (what you see is what you get) design tool, I realized we might be able to simply use Google's My Maps as a design canvas. This was ideal because I had already been using it to plot our test segments and design our prototypes to be location context-aware. From My Maps I could export a KMZ file, which I could then convert to a CSV file. This allowed me to design and export files for the dialogue and import them into our own testing software. The CSV files contain the dialogue for our conversational agent as well as geolocation coordinates, media file names, and directional angle values for spatial rendering.
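For a sense of what loading one of those rows might look like, here is a rough Kotlin sketch. The column order, delimiter, and WKT point format are assumptions based on typical KMZ-to-CSV conversions, not a record of our exact schema.

```kotlin
// One geolocated prompt row as designed in My Maps and exported via KMZ → CSV.
data class PromptRow(
    val title: String,
    val description: String, // dialogue text plus our $-prefixed tokens
    val lat: Double,
    val lon: Double
)

fun parseRow(row: String): PromptRow {
    val cols = row.split(";") // assumed delimiter
    // Converters commonly emit placemark geometry as WKT, e.g. "POINT (-79.4146 43.6385)"
    val (lon, lat) = Regex("""POINT \(([-\d.]+) ([-\d.]+)\)""")
        .find(cols[2])!!.destructured
    return PromptRow(cols[0], cols[1], lat.toDouble(), lon.toDouble())
}

fun main() {
    println(parseRow("Trillium Gate;Welcome to Trillium Park;POINT (-79.4146 43.6385)"))
}
```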

This is how it works:
The operator imports a CSV file into the test console, loading the preconfigured prototype design, visible both as a 'list of sounds' and as anchors within a Google Map. Each prototype can consist of dozens of geolocated sounds and conversational prompts. A connection is then established between the web client and the companion Android application on a device carried by the test participant. The Android application also streams the device's GPS coordinates to the operator's client, and the participant's location is updated in real time for the operator to monitor within the test console.
The operator can toggle between automatic and manual modes. In automatic mode, the audio prompts trigger when the cyclist enters a tight radius around the geolocated prompt. In manual mode, the operator is responsible for triggering each prompt, as in a traditional WoZ test. To make manual mode easier, prompts near the test participant are dynamically suggested to the operator, making finding the correct ones quicker. If necessary, the operator can also improvise and type responses, which proved invaluable when our participants made unexpected turns, or when we wanted to reflect real-world conditions, such as the rider's speed or the weather.
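Automatic mode boils down to a geofence check. Here's a minimal Kotlin sketch: the haversine formula gives the distance between rider and prompt, and any untriggered prompt within the radius fires (the 20 m radius is illustrative, not the exact figure we used).

```kotlin
import kotlin.math.*

data class Prompt(val name: String, val lat: Double, val lon: Double)

// Haversine distance in metres between two GPS fixes.
fun distanceMeters(lat1: Double, lon1: Double, lat2: Double, lon2: Double): Double {
    val dLat = Math.toRadians(lat2 - lat1)
    val dLon = Math.toRadians(lon2 - lon1)
    val a = sin(dLat / 2).pow(2) +
            cos(Math.toRadians(lat1)) * cos(Math.toRadians(lat2)) * sin(dLon / 2).pow(2)
    return 2 * 6_371_000.0 * asin(sqrt(a))
}

// Fire any prompt the rider has entered the radius of, at most once each.
fun autoTrigger(riderLat: Double, riderLon: Double, prompts: List<Prompt>,
                triggered: MutableSet<Prompt>, radiusM: Double = 20.0) {
    prompts.filterNot { it in triggered }
        .filter { distanceMeters(riderLat, riderLon, it.lat, it.lon) <= radiusM }
        .forEach { triggered += it; println("Trigger: ${it.name}") }
}

fun main() {
    val prompts = listOf(Prompt("Bike lane ends ahead", 43.6391, -79.4140))
    autoTrigger(43.6390, -79.4141, prompts, mutableSetOf()) // ~14 m away: fires
}
```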
WoZ operator console demo
Spatial audio and sound design
In addition to text strings for speech synthesis, the triggered prompts can include the names of audio files, as well as angles in degrees from true north. Since our design tool, Google My Maps, has no fields for angle values or audio file attachments, I established a text format within the fields I was already using for text-to-speech dialogue. To distinguish these values, we used predefined special characters which could then be parsed by our companion Android application.
In the title field, I would use a dollar symbol followed by a value representing the angle in degrees from true north that the sound should play from. I'd also indicate playback of a sound file using $sound in the description field, followed by the filename. All of the earcons and music files I used in the prototypes were stored on the device and accessed by the companion Android app when needed for spatial rendering.
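A sketch of how the companion app might parse those tokens (in Kotlin; the exact token syntax here is a reconstruction for illustration, not the literal format we shipped):

```kotlin
data class ParsedPrompt(val speech: String, val angleDeg: Double?, val soundFile: String?)

// Pull the $-prefixed tokens out of a My Maps placemark. The angle lives in
// the title (e.g. "Pavilion cue $90"); the sound file lives in the
// description (e.g. "Welcome to the park $sound chime.mp3").
fun parsePrompt(title: String, description: String): ParsedPrompt {
    val angle = Regex("""[$](\d+(?:\.\d+)?)""").find(title)
        ?.groupValues?.get(1)?.toDouble()
    val sound = Regex("""[$]sound\s+(\S+)""").find(description)
        ?.groupValues?.get(1)
    // Whatever remains in the description is spoken via text-to-speech.
    val speech = description.replace(Regex("""[$]sound\s+\S+"""), "").trim()
    return ParsedPrompt(speech, angle, sound)
}

fun main() {
    println(parsePrompt("Pavilion cue \$90", "Welcome to the park \$sound chime.mp3"))
    // ParsedPrompt(speech=Welcome to the park, angleDeg=90.0, soundFile=chime.mp3)
}
```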
To create spatial audio, the AR software development kit lets us connect the Android application with real-time positional data from a pair of smart glasses, and we use the Google Resonance library to reference that data and render all our audio spatially at the desired angle. Multiple sounds can be overlaid at different angles. This allows us to play back earcons (sound UI) and synthesized speech on a 360-degree soundstage. We found contrasting angles make the 360° soundstage more evident. However, sometimes normal stereo is more suitable; when no angle is selected, the sound is synthesized as head-locked (standard) stereo.
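Under the hood, rendering "at an angle from true north" reduces to placing the source relative to the listener's current head yaw. A minimal Kotlin sketch, assuming a simple x-right / z-forward axis convention (renderers like Resonance take source positions through setPosition-style calls and have their own conventions):

```kotlin
import kotlin.math.*

// Place a source on a unit circle around the listener, given the desired
// world angle (degrees from true north) and the rider's current head yaw.
fun sourcePosition(worldAngleDeg: Double, headYawDeg: Double): Pair<Double, Double> {
    val rel = Math.toRadians(worldAngleDeg - headYawDeg)
    return sin(rel) to cos(rel) // (x, z), one metre from the listener
}

fun main() {
    // A sound anchored due east (90°) while the rider faces north lands
    // hard right; with no angle we'd skip this and play head-locked stereo.
    val (x, z) = sourcePosition(worldAngleDeg = 90.0, headYawDeg = 0.0)
    println("x=%.2f z=%.2f".format(x, z))
}
```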
Second Prototype
Immersive VR video with ambisonic spatial audio
In our road tests, we got a lot of mixed feedback from participants about the perceived value and effectiveness of spatial audio, including whether it was even noticeable. We wanted to understand how we might use it more effectively, and needed a controlled and safe way to explore, experiment and validate our designs.
To help us learn, I decided to build a VR prototype. It was an option I'd highlighted earlier in my test design process, and given the inconclusive feedback from our road tests regarding spatial audio, we decided to go forward with it to learn more. Our VR prototype uses the VR180 video format: a 180-degree field of view with stereoscopic depth. I used a specialized camera to record this VR video on and around various cycling routes in Toronto while riding an electric bike.
This video was mixed with spatial ambisonic audio. Ambisonic audio formats and VR180 video are both supported by YouTube and, when viewed with an Android device and Google Daydream VR headset, can reproduce the smart glasses' spatial audio experience with a high degree of experiential fidelity. The primary purpose of these tests was to understand the desirability and usability of spatially enhanced audio cues, and to allow us to conduct them in a controlled environment.
We ran this test with 11 participants across 5 different scenarios, which included prototypes I'd designed for side-by-side comparisons of Google Maps' turn-by-turn directions against our turn-by-turn directions enhanced by spatial audio cues. We also tested 'tour guide' features that took advantage of spatial audio cues to provide branching navigation and points of interest throughout the city.
VR prototype setup
To increase the experiential fidelity of this prototype, I incorporated a stationary bike that our participants would ride while viewing the VR prototype. Using screencasting, the facilitator was able to view whatever the participant was seeing on a nearby monitor.
VR prototype demo
Below is a demo of one of our tour guide prototypes. YouTube supports VR180 video and first-order ambisonic audio, so if you wear headphones while viewing this video you'll be able to hear the ambisonic spatial audio. Use your mouse to simulate head movement and you'll hear the audio pan accordingly, keeping the soundstage in the correct orientation.
Spatial audio for exploration
In this tour guide scenario, the rider experiences a guided tour of Trillium Park. Points of interest are highlighted during the tour, and the user is able to ask questions about the park in an intuitive way because the assistant has the directional context of the user's head position. This allows the assistant to answer a question like "What kind of trees are those?" by knowing which ones the user is looking at.
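One way to resolve that kind of question is to compare the rider's head yaw against the bearings of nearby points of interest and pick the closest match. A Kotlin sketch of the idea (the 30° window and the data shapes are illustrative assumptions):

```kotlin
import kotlin.math.abs

// Resolve "what kind of trees are those?" by picking the point of interest
// whose bearing best matches the rider's head yaw. POI bearings would come
// from the same GPS math as the soundstage.
data class Poi(val name: String, val bearingDeg: Double)

// Smallest difference between two angles, accounting for wrap-around at 360°.
fun angleDiff(a: Double, b: Double): Double {
    val d = abs(a - b) % 360.0
    return if (d > 180.0) 360.0 - d else d
}

fun lookedAtPoi(headYawDeg: Double, pois: List<Poi>, windowDeg: Double = 30.0): Poi? =
    pois.minByOrNull { angleDiff(headYawDeg, it.bearingDeg) }
        ?.takeIf { angleDiff(headYawDeg, it.bearingDeg) <= windowDeg }

fun main() {
    val pois = listOf(Poi("Red maples", 95.0), Poi("Pavilion", 210.0))
    println(lookedAtPoi(headYawDeg = 100.0, pois = pois)?.name) // Red maples
}
```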
Spatial audio for navigation
To help us measure the effectiveness of spatial audio in enhancing navigational prompts and lowering riders' cognitive load, we tested multiple variants of an experience in which the voice assistant provided turn-by-turn navigational prompting: several versions that used spatial audio in different ways, contrasted against our control.
Spatial audio design principles
I designed our VR prototypes not only to validate that spatial audio had value for micromobility riders, but also to identify where and how spatial audio can best be applied. This resulted in a set of applied principles that help users more intuitively distinguish and interpret spatial audio cues.

Validating HMI Hardware
We faced a lot of uncertainty when it came to hardware; as a result, we tested several formats, systematically switching between them throughout our road tests while documenting user feedback.
Shoulder worn directional speakers
Smartglasses with integrated spatial audio
Mounted Bluetooth speaker
Smart helmet with integrated directional speakers
Road tests
With each participant, we conducted a 2-hour session in which they were taken through several scenarios during road tests on bike share and electric scooters. This allowed us to validate and invalidate our various hypotheses about our concepts' hardware and software features in a real-world setting. Each session included a half-hour context interview, followed by a 1.5-hour road test covering four different prototypes, with a five-minute experience review interview between each prototype.
Road test highlight reel
Research Synthesis
Throughout our testing, we transcribed our recordings and atomized our research into an insights tool called Optimal Workshop, which helped us with our analysis and research synthesis. We also made ample use of our physical workspace.
User Research Insights
Tailored Routing
When asking for directions, participants were presented with a choice through a conversational interface that asked them to make a value judgement between two routes, one prioritizing safety and the other time. Routing choice is something users typically have with screen-based navigation products like Waze or Google Maps, but it is not available in any voice-based products.
Navi can help riders find the route that suits their individual needs.
Insights
- In our survey, we learned that riders make routing decisions based on numerous internal and external factors, and rarely take the entire default route provided by Google.
- Empowering riders to make informed routing decisions will increase riders’ confidence, control, and feeling of safety.
- Participants unanimously rated this feature as highly desirable.
Recommendations
- Parameters to explore with tailored routing may include Safe, Direct, Scenic, Low-Pollution, Elevation etc.
"I really liked that, it depends on the time of day [...] midday I know it's going to be a bit more busy so I might take the safe route. You get that option in the car when you use GPS to choose different routes so it's good to have it here too" Peter, 40
Turn-by-Turn Navigation
Any navigation software today is expected to have turn-by-turn directions, and as such consumers have high expectations for directional prompts to be both timely and accurate.
Participants in our testing were supported by Navi giving them timely audio-based Turn-by-Turn directions. By asking riders directly, Navi can personalize the frequency and detail to a user's unique needs.
Insights
- Turn-by-Turn navigation via audio reduces cognitive load, allowing riders to focus on the road
- Turn-by-Turn is an industry standard and users didn’t see it as a major differentiator
- Riders have high expectations for directions that are accurate, intuitive and timely. If prompts are not accurate this feature loses value quickly and significantly.
- The desired frequency of directional prompts is typically tied to both a user's experience level and whether they’re riding in a familiar or unfamiliar area.
Recommendations
- Accurate, intuitive and timely directional prompts are essential
- Personalized prompting based on rider data (comfort, experience, location history, preferences)
- Avoid cardinal directions; use conversational language.
- A focus on good sound design and audio-first cues can help differentiate our client
"It's definitely something I would be using, it’s better than having a phone in there, you don't even need a map, it just tells you where to go." Alina, 32
Proactivity
On the road, riders will encounter a wide range of unexpected obstacles and conditions such as infrastructure changes, road construction or weather conditions.
Our bikeshare participants were presented with payment information updates and a notification that their destination bike station was filling up (it's a docked system), and given an option to reroute.
Navi provides riders with dynamic and relevant information about things like road conditions, construction, or infrastructure.
Insights
- Proactive, contextual warnings allow riders to make informed riding decisions while en route.
- When riders can visualize the path ahead and anticipate changing conditions, they ride with more confidence and feel safer.
- Proactive warnings were ranked as extremely valuable by all participants and seen as a major product differentiator.
Recommendations
- Collect data (from open-source map platforms, public databases, crowdsourcing, etc.) to provide proactive warnings. On its own, this data will be inconsistent and difficult to scale.
- Augment this data set by establishing a crowdsourced reporting feature similar to Waze navigation, creating live up-to-date condition reporting.
- Personalized prompting based on rider data (comfort, experience, preferences)
"During the navigation, there were prompts about the bike lanes ending or that there's construction up ahead, so I'll change my route or turn off if I'm not comfortable. I really like to know what's going on in advance."Peter, 40
Additional learnings & Next steps
While our focus was on the core features of navigation and safety, we also explored some additional use cases and learned that, across the wide range of riders, meeting some of these lower-priority jobs can create motivating gain creators. Fitness tracking, music and tourism all provide interesting avenues to create additional value for some customer segments.
Throughout our testing, we also gained valuable insights into hardware: insights that highlighted potential feasibility risks and enabled us to make evidence-based recommendations on potential integrations with our client's existing product line, in addition to the HMI opportunity presented by the rise of shared bike and scooter systems.
Our client selected us to continue this project further into hardware development and a functional code-based prototype. An additional and surprising outcome of this early prototyping and experimentation was the identification and filing of two patents.
Our product owner had recognized there was a wealth of interesting concepts, some tangential to our work but interesting nonetheless, so he provided us with a form to submit concepts. Two ideas that I had submitted were selected by an innovation board as novel and patent-worthy. After this phase of the work, I was asked to collaborate with a team of technical writers and a legal team to develop the ideas further, document them, and file for two US patents. To give myself time to do this I was moved off the project, but I continued to act as an advisor during the next phase of work, which involved the development of a functional prototype: a multimodal, voice-first HMI for micromobility users.