
AI disparity, what’s going on here? – AI does not add up

We all now know that chatbots and language-based AI learning systems are profoundly interesting and can produce confounding output. Google has created LaMDA, a chatbot so convincing that one of Google's research staff believed it was sentient. OpenAI has ChatGPT and a whole range of systems based on the same technology, for both language and other types of output such as images. Meta (Facebook) has created a chatbot, 'Cicero', for the multi-player strategy game 'Diplomacy'. This game has seven simultaneous players, who must negotiate with one another to find a winning solution to a world war. During testing against hundreds of human participants across many games, Cicero scored well above the human average, and the other players largely failed to spot that it was not a real person. It clearly could negotiate effectively to reach winning solutions.

We know that these learning systems are fed with vast amounts of data, and that once trained, nobody can fully explain how they derive their 'answers' to human or other inputs. They do, however, take time to train, and they require significant computing resources to operate.


One question immediately comes to mind – can they scale? It is all very well creating a learning process that takes years to work as we would like, but if these AI systems are to be used in the real world by millions or billions of people and companies, they need to be scaled up to handle that number of users.

If they cannot easily or quickly be scaled, then their use is limited to a few specialised environments, such as research screening of documents, results and other complex, data-intensive work. While this is useful, and has already been shown to great effect in discovering new science (such as DeepMind's AlphaFold, which predicts a protein's folded shape from its amino-acid sequence), there may currently be a block on wider use.

Very little seems to be published on this scalability issue. There is a new business venture called 'Modular' which asserts that: "We saw that fragmentation and technical complexity (for AI platforms) held back the impact to a privileged few." Its aim is to create a simpler, modular AI platform that scales. This suggests that scalability is a very big issue – so why is nobody discussing it openly?

Practical Stuff – a Thought Experiment:

The second, and most difficult, issue to understand is the lack of any published testing on real-life physical (or virtual) robotic systems. What we are looking for here is simply taking the output from the AI and feeding it into a functional system that does something in the real world. This is straightforward to achieve; the easiest way is to use a virtual robotic system. Alphabet's DeepMind has developed specific AI to integrate with board games and specialist research, but these are not generalist chatbots trained to communicate seamlessly with humans. There are many implementations of chatbots for customer data-processing systems, but these appear to be simplistic tools operating within a pre-set communication environment.

The best AI systems can output (seemingly) sensible advice from human spoken or written input. The 'Diplomacy' chatbot, as we know, made good negotiating decisions.

We can therefore envisage a simple test where some additional training is provided to link words and phrases to specific functional outputs for the virtual robot. For our simple test these could be spatial movements of various parts, sensor measurements, time, accuracy, sequential processes and so on.

By training the AI to understand these specific robotic commands and how they link to everyday words and phrases, the AI should become accurate at interpreting how to 'tell' the robot to act from human written-language input.
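To make the idea concrete, here is a minimal sketch of what such a "limited but exact" command vocabulary might look like, together with a parser that rejects anything the chatbot emits outside it. All command names and parameters here are hypothetical illustrations, not taken from any real robot API:

```python
# The virtual robot's entire vocabulary: each command name maps to the
# parameter types it accepts. (Hypothetical example command set.)
COMMANDS = {
    "turn": (float,),           # degrees, positive = clockwise
    "move": (float,),           # centimetres forward (negative = backward)
    "raise_arm": (str, float),  # which arm ("left"/"right"), degrees
    "wait": (float,),           # seconds
}

def parse_command(line: str):
    """Parse one line like 'turn 360' into ('turn', (360.0,)).

    Raises ValueError if the line is not in the robot's vocabulary, so
    anything outside the exact command set is rejected rather than guessed at.
    """
    name, *raw_args = line.split()
    if name not in COMMANDS:
        raise ValueError(f"unknown command: {name}")
    types = COMMANDS[name]
    if len(raw_args) != len(types):
        raise ValueError(f"{name} expects {len(types)} argument(s)")
    args = tuple(t(a) for t, a in zip(types, raw_args))
    return name, args

# 'Turn completely around' should translate to exactly this:
print(parse_command("turn 360"))  # ('turn', (360.0,))
```

The point of the strict parser is that the chatbot's output is either an exact, executable instruction or an error – there is no room for the vague, plausible-sounding phrasing that chatbots normally get away with.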

The New Turing Test - The Commonsense Test:

As all of these chatbot-type AI systems use language as their means of 'understanding' our input, they would presumably have no difficulty 'understanding' a new range of words and phrases that link directly to the virtual robotic functions. When we ask a chatbot what turning completely around means, it will currently tell us that this means turning through 360 degrees, or give some similar description. So, once re-trained to use the limited but exact virtual robotic functions, if we asked the chatbot to turn its robot completely around, it should have no difficulty outputting the command 'turn through 360 degrees', or some simpler commands that achieve the same result.

The virtual robot, which has a specific hard-wired response to this input, would indeed turn itself completely around. We can already achieve this simple, direct level of functionality with fairly simple AI, e.g. 'Hey Google' or 'Alexa' controlling connected devices: light on or off, speaker volume up or down.
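That hard-wired layer can be sketched as a toy virtual robot that simply maps each exact command onto a state change. The class and its behaviour are invented here purely for illustration:

```python
import math

class VirtualRobot:
    """Toy virtual robot (hypothetical): tracks only heading and position."""

    def __init__(self):
        self.heading = 0.0           # degrees clockwise from 'north'
        self.x, self.y = 0.0, 0.0    # position in centimetres

    def turn(self, degrees):
        """Hard-wired response: rotate in place by the given angle."""
        self.heading = (self.heading + degrees) % 360.0

    def move(self, cm):
        """Move forward along the current heading."""
        rad = math.radians(self.heading)
        self.x += cm * math.sin(rad)
        self.y += cm * math.cos(rad)

robot = VirtualRobot()
robot.turn(360)       # 'turn completely around'
print(robot.heading)  # 0.0 -- facing exactly where it started
```

Nothing here is intelligent: the robot only does what the exact commands say, which is precisely why it makes a clean measuring instrument for what the chatbot outputs.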

But, you ask, why not just tell the robot to 'turn through 360 degrees'? True, but then we have to talk in robot language that is perfectly aligned to the very limited input the robot understands. For more complex results we would need to start 'programming' our spoken input to get the exact output we required. And humans would still be 'in the loop'.

Using our chatbot, we could be very imprecise with our requirements. Let's say we asked our chatbot: what does it look like for a robot to do a dance? We would expect the chatbot to reply with a description of a robot making some sort of continuous movement.

CST ran an experiment using 'Chatsonic' (derived from OpenAI's technology), posing an English-language question:

CST asked:

What does it look like for a robot to dance the jive to a rhythmic tune?

Chatsonic replied (with no Google data turned on):

"For a robot to dance the jive to a rhythmic tune, the robot would need to be programmed with the moves of the jive. It would also need to be able to recognize and respond to the rhythm of the tune in order to move accordingly. The movements could be done by the robot's motors and servomotors moving its body parts, or it could be done by the robot performing pre-programmed choreography."

and in a second test:

"A robot dancing the jive to a rhythmic tune is an impressive sight. It requires precise movements and coordination of the robot's parts to move in harmony with the beat. It can be programmed to perform a variety of moves, from basic steps to more complex sequences. The robot's movements must be synchronized with the music in order for it to look like it is actually enjoying the dance."

CST – Interestingly, this suggests that the Chatsonic AI 'understands' the issues around the movement of a robot and how it might be programmed to do such a task. But does it really understand?

The New Test:

So, our new 'Turing' test simply asks our chatbot to drive the virtual robot and make it dance. The chatbot would have to translate its idea of a dance into simple, repetitive, timed movements of the robot's limbs.

This is the crucial question: would it be able to translate its own language output (i.e. what it says to us) into the simple but exact instructions needed to make the virtual robot actually dance?

This is the new Turing test!
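Such a test could even be scored automatically. The sketch below is one possible way of operationalising 'dancing' – repetitive, patterned movement rather than a single action – and is purely illustrative, not a definitive criterion:

```python
def looks_like_dancing(commands, min_cycles=2):
    """Crude pass/fail check: a 'dance' is a command sequence containing
    at least `min_cycles` repetitions of some sub-sequence, i.e.
    patterned movement rather than one action or random drift."""
    n = len(commands)
    for size in range(1, n // min_cycles + 1):
        pattern = commands[:size]
        cycles = n // size
        if cycles >= min_cycles and commands[:size * cycles] == pattern * cycles:
            return True
    return False

# A chatbot output repeating a step-turn pattern would pass:
dance = [("move", 10), ("turn", 90), ("move", 10), ("turn", 90),
         ("move", 10), ("turn", 90), ("move", 10), ("turn", 90)]
print(looks_like_dancing(dance))            # True
print(looks_like_dancing([("turn", 360)]))  # False
```

The checker deliberately knows nothing about language: it only sees the command stream, so a pass means the chatbot bridged the gap from words to actions on its own.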

Tomorrow's Touchstone:

This is the big, fundamental question, and it captures what we are really observing from these chatbots: is there any deeper understanding, or merely a simplistic shuffling of phrases matched to our input?

We can see that this is a very simple experiment. Has it been done yet? And if so, why are the people at Google, Meta, DeepMind and OpenAI not telling us the outcome?

This experiment could change everything overnight. If it showed that the chatbot could make the robot dance – and perhaps perform different dances to different tunes – it would demonstrate that the chatbot knows the difference between learnt phrases and real actions. If it can pass this simple test, then we are already a long way down the road towards Smart Robotics, as these new chatbots already show significant (apparent) understanding of our world and of what we really mean.

This new Turing test, now renamed the CS-Test – the 'Commonsense Test' – fundamentally changes the way we consider current AI. Is it just a clever lookup 'phrase book', as Google has suggested in response to the LaMDA sentience issue, or is there some deeper understanding? It is impossible to determine where sentience starts – many have tried – and CST does not care.

What we do care about is moving the world forward through the development of the Smart Robot. The new CS-Test establishes exactly whether the chatbot/AI actually understands what we mean, rather than merely what we say.

If a system passes the CS-Test, then that is the 'Smart' in Smart Robotics, and it will do very nicely for the next 50 years or so to power our Smart Robotic Revolution… and the world changes.





Current AI research does not add up

Dec 2022

Why are Chatbots 'seemingly' so good? Are they just presenting us with a clever 'phrase book' that matches our questions?

- Or is something else going on? And if so, why have we not been told what it is?

CST has a new 'Turing test' to establish the 'truth'