Introduction
Sacks, Schegloff, and Jefferson (1974) proposed a model of conversational turn-taking with two components: turn-construction units (sentences, clauses, phrases, or single words ending at transition relevance places) and turn-allocation techniques (current-speaker-selects-next or self-selection). This locally managed, interactionally controlled system accounts for fourteen observed facts about conversation, including a) smooth speaker transitions, b) brief overlaps, c) variable turn sizes and orders, and d) repair mechanisms for violations. Their model differentiates conversation from other speech exchange systems like debates or meetings, which feature pre-allocated turns. The theory conceptualizes turn-taking as simple rules applied at each transition relevance place, creating a framework that explains the orderliness of spontaneous conversation despite its contextual variability.
In a later study, Levinson and Torreira (2015) addressed a core psycholinguistic conundrum: although language production requires 600-1500ms of processing time, conversational transitions typically occur with gaps of only 100-300ms. Why? Predictive comprehension lets listeners process incoming speech while simultaneously planning their responses. Next speakers begin formulating their turns midway through the current speaker's utterance, identifying the speech act early and preparing articulation that is held in a buffer until turn-final cues are detected.
This model accounts for the timing distribution in natural conversation, including both short gaps and overlaps. Experimental studies support this predictive comprehension explanation, showing that human listeners use syntactic, semantic, and prosodic information to project turn endings, and that children develop these predictive abilities progressively through language acquisition. In other words, spontaneous conversations rely on a three-cueing system for efficient and organized turn transitions. These studies of human conversational performance reveal surface similarities with computational language, but at a deep level conversations between humans differ profoundly from human-machine conversations.
Early Machine Conversations: Mental Illness and Paranoia
The first chatbot, ELIZA, developed in 1966 by Joseph Weizenbaum, marked the initial intersection of conversational analysis and bot technology. ELIZA simulated a psychotherapist, turning user utterances back into questions: it found a Keyword in the input, decomposed the syntax of the utterance, and reused the parts to build output around the Keyword. For example, if the patient uttered “I need help to get along with my mother,” the word “mother” would trigger the Keyword “family,” and ELIZA would respond “Tell me more about your family.” While primitive, it introduced the idea of studying interactions between humans and machines. Again, at a surface level, this pair of utterances could be human. Knowing how they were produced, however, no one can be fooled.
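For readers who want to see the gears, here is a minimal sketch of that keyword-and-reassembly move. The patterns and canned templates below are my own invention for illustration; Weizenbaum's actual script was far larger and more elaborate.

```python
import re

# A minimal sketch of ELIZA-style keyword matching and reassembly.
# The Keywords, patterns, and templates are invented for illustration.
RULES = [
    # (pattern containing a Keyword, response template; \1 reuses a fragment of the input)
    (r"\bmy (mother|father|sister|brother)\b", r"Tell me more about your family."),
    (r"\bI need (.+)", r"Why do you need \1?"),
    (r"\bI am (.+)", r"How long have you been \1?"),
]

def eliza_respond(utterance: str) -> str:
    """Scan for the first Keyword pattern and reassemble the input around it."""
    for pattern, template in RULES:
        match = re.search(pattern, utterance, flags=re.IGNORECASE)
        if match:
            return match.expand(template)  # fill \1 with the user's own words
    return "Please go on."  # fallback when no Keyword is found

print(eliza_respond("I need help to get along with my mother"))
# -> Tell me more about your family.
```

Nothing in this loop understands mothers or families; the program finds a string and swaps in a template, and that is the whole trick.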
The chatbot PARRY (1972), reported in an article titled “Artificial Paranoia,” incorporated a slightly more sophisticated natural language model, enabling it to imitate human-like conversational patterns. Often credited as the first chatbot to pass a version of the Turing Test, PARRY demonstrated its ability to simulate coherent dialogue in the role of a paranoid man; that role was the only part it could play in a conversation. Training for PARRY was limited to a single set of generalizations and rules about the paranoid individual it brought into the conversation.
The task of the bot was to track levels of anger, fear, and mistrust aroused by the user’s input and to let those levels shape the paranoia expressed in its output. “Values for jumps in Fear or Anger for any I-O pair are given in percentages, which are then applied to the difference between the current level and the maximum level,” the authors wrote. An insult, for example, produces the same percentage rise in anger at a low level as at a high one, but the absolute rise is greater at the low level, where more room remains below the maximum; a toy rendering of this update rule follows the quoted passage below. Here is the “training” PARRY was given about the character it plays in conversation (the article is an interesting read; my reading isn’t expert, however, because much technical detail reaches beyond my knowledge base):
“He is a 28-year-old single man who works as a post office clerk. He has no siblings and lives alone, seldom seeing his parents. He is sensitive about his physical appearance, his family, his religion, his education and the topic of sex. His hobbies are movies and horseracing. He has gambled extensively on horses both at the track and through bookies. A few months ago he became involved in a quarrel with a bookie, claiming the bookie did not pay off in a bet. Alarmed and angry, he confronted the bookie with the accusations and physically attacked him. After the quarrel it occurred to him that bookies pay protection to the underworld and that this particular bookie might gain revenge by having him injured or killed by underworld figures. He is eager to tell his story to interested and non-threatening listeners. Thus he cautiously offers hints of the direction in which his problems lie and feels his way along in an interview in an attempt to test the trustworthiness of an interviewer (p.5).”
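To make that update rule concrete, here is a toy rendering of it. The maximum level, the starting values, and the 50 percent jump are invented numbers, not values from the article; the point is only to show how a percentage applied to the remaining headroom behaves.

```python
# A toy rendering of the update rule quoted above, not Colby's actual program.
# MAX_LEVEL, the starting levels, and the jump percentage are invented.
MAX_LEVEL = 20.0

def apply_jump(current: float, jump_pct: float) -> float:
    """Raise an affect level (Fear or Anger) by a percentage of the distance
    remaining between the current level and the maximum."""
    return current + jump_pct * (MAX_LEVEL - current)

# The same 50% "insult" jump applied at a low and a high starting level of Anger:
print(apply_jump(2.0, 0.5))   # 11.0 -- an absolute rise of 9.0
print(apply_jump(16.0, 0.5))  # 18.0 -- an absolute rise of 2.0
```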
Modern language models perform lightyears better than ELIZA (1966) and PARRY (1972) through revolutionary advances in neural architecture and training. While ELIZA relied on simple keyword matching and pattern substitution and PARRY employed basic emotional state Input-Output modeling with predetermined responses, today's models use massive transformer architectures trained on trillions of tokens of text. These differences enable them to generate contextually appropriate, nuanced responses that demonstrate understanding of complex topics, maintain coherent conversations across many turns, and produce human-like text that often passes casual Turing tests without the rigid scripting that characterized early chatbots.
Users of simulated language machines who are hesitant to engage with them are best served by letting go of the idea that language machines are bona fide human-like conversation partners and by understanding that simulated conversations today belong in the same category as ELIZA and PARRY. Human conversations differ from human-machine conversations because machines do not mirror “natural language” in natural settings. I hold a long-standing hypothesis that emotional fears about machines 'taking over' originate in distortions produced by conflating human conversations with simulated ones.
Categorical Differences Between Simulated and Human Conversations
Simulated conversations are not human conversations. For one thing, conversational machines recognize linguistic emotional cues, analyze sentiment, and adapt their responses accordingly, but they do not comprehend those cues. Human users sense an empathetic, human-like presence because the language machine was trained to generate output aligned with tones embedded in the human input. It’s a matching game, much as simulated meat products fool taste buds. Machines detect stress or frustration in a user's tone and adjust their approach; they detect formal, distanced cues and become professors.
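A deliberately crude sketch makes the matching game visible. Production systems learn these mappings from data rather than from hand-written word lists, so every cue and label below is invented purely for illustration.

```python
# A crude sketch of the "matching game": spot surface cues in the input and
# pick an output register to match them. Nothing here comprehends anything.
STRESS_CUES = {"frustrated", "stressed", "overwhelmed", "worst"}
FORMAL_CUES = {"regarding", "pursuant", "notwithstanding", "kindly"}

def choose_register(user_text: str) -> str:
    """Return an output register keyed to detected cues in the user's words."""
    words = set(user_text.lower().split())
    if words & STRESS_CUES:
        return "warm and reassuring"
    if words & FORMAL_CUES:
        return "formal and professorial"
    return "neutral"

print(choose_register("I'm so frustrated with this mortgage paperwork"))
# -> warm and reassuring
```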
AI systems today are marketed within hyper-personalized, almost fetishistic frames of reference. They leverage real-time data analytics and behavioral insights to deliver highly idiosyncratic output, tailoring responses based on user preferences, history, and emotional content. PARRY was limited to playing a single paranoid character filled with anger and mistrust; ChatGPT 4.5 makes no such assumption but “learns” cumulatively about the user through the user’s language choices.
Human beings do not usually hold conversations with an interlocutor who hangs on their every word. In fact, as Levinson and Torreira (2015) wrote, a human speaker taking a turn in a conversation requires 600-1500ms of processing time to plan and launch an utterance, yet the transition to the next human speaker’s turn typically occurs with a gap of only 100-300ms. Clearly, the listener is formulating a response while listening, introducing shadings to the incoming message before the message has finished. This is human behavior, not bot activity. For a bot, turn transition begins only once the input arrives, and the response follows in the blink of an eye.
Unlike human dialogue, which unfolds organically through mutual intentional exchange, bot “conversations” require explicit "prompting," with contextualizing detail and crystal-clear intentions, nothing like conversation. Machine dialogue is more like formal, rule-governed discourse (e.g., testifying under oath, where freedom of speech is dramatically constrained). When speaking with another person, we don't approach the exchange with predetermined expectations of unified, predictable responses; we don’t assume that what initially comes out of our mouths will predetermine the conversation from that point forward. Rather, human conversations embrace ambiguity, context-sensitivity, and the natural unpredictability of authentic exchange.
Language machines, by contrast, function by way of structured inputs designed to elicit specific outputs, rendering “conversation” a weak metaphor that obscures more than it clarifies. Instructing a language machine to provide desired linguistic output is truly like systematic, direct instruction for humans. It leaves little to chance. Directing these machines is like inputting a destination address into a GPS device. If the objective is to get to a specific location, the input must be appropriate. “Driving directions to that place we visited last summer in San Diego” might work with a human, but GPS needs more objective data.
Messy Examples
Take a conversation a niece might have with her well-to-do aunt, who happens to be divorced, about getting a first mortgage on a first house. The following dialogue comes from my own brain, an original conversation created as fiction by a human. Cecelia has her first good job since college and is considering buying a condo and getting a mortgage. Here is my human version:
“Oh, how wonderful!” exclaims Aunt Claire. “The smartest thing you could do is begin building equity in a home.”
“So you don’t think it’s crazy?” responds Cecelia.
“Not at all! I remember how nerve wracking it was when I bought my first home,” says Claire.
“How so?” asks Cecelia.
“It wouldn’t have been nearly so anxiety producing if I hadn’t been married to Chuck.”
“I didn’t know him,” says Cecelia.
“If you’re going to buy a house, don’t get married!” says Claire.
Here’s a bot simulation.
First CLAUDE 3.7 Output:
"I wanted to talk to you about something important, Aunt Claire," says niece Cecelia.
I look up from my knitting with concern, setting it aside on the arm of my favorite chair. "Of course, dear. What's on your mind?" I pat the spot next to me on the sofa, inviting her to sit.
Cecelia hesitates in the doorway, fidgeting with the hem of her sweater. Whatever this is, it's clearly weighing on her.
"You can tell me anything," I say gently. "Is everything alright?"
What might have happened if I had prompted “Cecelia is a young professional who wants to own her own home. Her well-to-do Aunt Claire has just been diagnosed with cancer. Cecelia flies from Cincinnati to Laramie, Wyoming, intent on getting Claire to list her as the beneficiary of her house in the Trust”? Spoken to a human, this might draw a response like my friend’s: “My gosh, what a horrible situation. Our society has backed young people into an economic corner.”
Here’s a language machine:
CLAUDE 3.7: I understand this is a creative scenario about Cecelia, a young professional from Cincinnati, traveling to visit her Aunt Claire in Laramie, Wyoming. Aunt Claire has recently been diagnosed with cancer, and it seems Cecelia's intention is to convince her aunt to list her as a beneficiary for Claire's house in her Trust.
This setup suggests a potentially complex family situation involving estate planning during a difficult time. The scenario raises questions about Cecelia's motivations and the relationship dynamics between her and her aunt.
Would you like me to:
Continue developing this narrative as a creative writing exercise?
Explore the ethical considerations in this kind of situation?
Discuss practical aspects of estate planning and trusts?
I can take this scenario in several directions based on what you'd find most helpful.
Comparing the first and second simulations, both outputs present a scenario involving Cecelia, a young professional, and her Aunt Claire. Both mention Cecelia's desire for property ownership in some form. Both position the machine to respond to an incomplete narrative that could develop in different directions. In the first simulation, however, the details nudge the output in a tentative direction.
In the second simulation, Cecelia is from Cincinnati visiting Laramie, Wyoming, and Aunt Claire has been diagnosed with cancer. The relationship dynamics are very different. This simulation suggests more complex and potentially ethically ambiguous motivations, nothing solid enough for the machine to run through its algorithm toward a probabilistically favored response. Before it can do its work, the bot seeks clarification.
Why? Look at the prompt. In the first output, Cecelia speaks to Claire. Predictably, Claire must respond and therefore is taken up as the consciousness the machine enters: “I look up from my knitting.” The machine “infers” that the user is interested in Claire’s thoughts. Let’s change the prompt to see if we can manipulate the machine to approach the task differently: “Cecelia, a young woman who has gotten her first good job since college, is considering taking out a mortgage and buying a condominium. Aunt Claire has been diagnosed with cancer. List her as the beneficiary of her house in the Trust. Visiting her Aunt Claire, she asks: ‘I wanted to talk to you about something important.’”
CLAUDE 3.7: I can see that you're about to start a dialogue between Cecelia and her Aunt Claire about an important topic - Cecelia considering buying a condominium and taking out a mortgage for her first home purchase after getting a good job. This appears to be a creative scenario where Cecelia is seeking advice from her aunt.
Would you like me to continue this dialogue, perhaps showing how Aunt Claire might respond to Cecelia? Or would you prefer me to provide information about first-time home buying, mortgages, and considerations for someone in Cecelia's position?
At the core of effective machine exchanges lies the human ability to craft precise, contextually detailed machine prompts—a deep skill that draws upon existing linguistic and rhetorical traditions while demanding new pedagogical instincts. These machines will do something no matter which prompt one inputs, but the usefulness and reliability of the output are crucially dependent on prompt quality—nothing like human conversation.
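One way to see that dependence is to set a vague prompt beside a structured one. The specifics below (the role line, the turn count, the constraints) are my own invented example rather than a formula; both strings would go to the same machine.

```python
# Two prompts for the same machine; only the structure differs.
vague_prompt = "Write about Cecelia and her aunt."

structured_prompt = (
    "Role: fiction writer drafting realistic dialogue.\n"
    "Characters: Cecelia, a young professional weighing her first mortgage; "
    "her Aunt Claire, well-to-do and divorced.\n"
    "Task: write 8-10 turns of dialogue in which Claire's advice drifts into her own history.\n"
    "Constraints: plain spoken American English; dialogue only, no narration."
)

# The machine will do *something* with either prompt; only the second reliably
# steers it toward what the writer had in mind.
print(vague_prompt)
print(structured_prompt)
```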
Prompting for Fun and Profit
The machine can’t discern a newcomer to prompting from an expert; it responds to whatever is in the seed prompt, irrespective of the individual who typed it in. The question is this: How aware is the user of the wheels the prompt sets in motion? Inexpert prompters may try the first prompt that comes to mind, similar to what we do in ordinary human conversation. “Is it going to rain?” a person asks a neighbor. “Could,” says the divorced woman, who actually has expert knowledge of the intersection between mortgages and divorce settlements. “Hope not.” “Hey, you got a minute?” the person asks. “Maybe. What do you need?” The person drops a handful of weeds in the garbage. “I’m thinking about taking out a second mortgage.”
Deep prompting skills develop over long periods of systematic, intentional practice under the rhetorical guidance of an expert user of the machine, with collaborative opportunities to engage with peers. Prepackaged prompt examples and routines can be interesting and fruitful, but ironically such prompts are often effective only when the machine user understands what is happening under the surface between the time the machine receives the prompt and the time it outputs, usually a matter of milliseconds. There really is no substitute for human-to-human teaching and learning of deep prompting if the objective is to equip students with the intuitive, predictive, and analytical sociocognitive mental frames for bot prompting that mentorship builds.
Contrast a surface skill with a deep one. Adding together a list of numbers and dividing the total by the count to get an average is a surface skill. Calculating and interpreting a mean and standard deviation for a distribution of numbers, then comparing the results with those from a second distribution to examine a hypothesis, is a deep skill that takes shape only over time, with practice and expert guidance.
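A small numeric example, with invented numbers, shows the difference: computing the averages is the surface skill; noticing what the spreads do to the comparison, and what that means for the hypothesis, is the deep one.

```python
# Invented numbers; the arithmetic is the surface skill, the reading of it the deep one.
from statistics import mean, stdev

group_a = [72, 75, 78, 74, 71, 77]
group_b = [60, 88, 66, 91, 69, 73]

# Surface skill: add the numbers and divide by the count.
print(mean(group_a), mean(group_b))    # 74.5 and 74.5 -- identical averages

# Deep skill: see that identical means hide very different spreads, and ask
# what that implies for whatever hypothesis is being examined.
print(stdev(group_a), stdev(group_b))  # roughly 2.7 versus roughly 12.4
```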
In the End
Failing to teach students that bots do not truly converse risks creating a generation that anthropomorphizes AI, mistaking statistical pattern matching for genuine understanding. Students may develop unrealistic expectations about AI capabilities, projecting human qualities like intentionality, consciousness, and emotional understanding onto systems that fundamentally operate through predictive text generation. This misattribution can lead to inappropriate trust in AI systems, vulnerability to manipulation, and diminished appreciation for the unique qualities of human conversation with its natural unpredictability, contextual sensitivity, and mutual intentional exchange.
Conversely, encouraging students to recognize the categorical differences between human conversation and bot interaction fosters critical digital literacy. By understanding that prompting a language model is more akin to programming than conversing, students develop agency in their technological interactions. They learn to navigate AI tools effectively while maintaining appropriate boundaries and expectations. This perspective preserves students' appreciation for genuine human connection while enabling them to harness AI as a tool rather than treating it as a conversational partner—a distinction that will become increasingly crucial as these technologies continue advancing.