
Functional Fixedness

When the average person thinks about what an IT consultant does (as often as they are likely to think about it), they generally picture what they see in movies: a somewhat nerdy guy (it’s usually a guy, but Sandra Bullock did break that mold) sitting in front of multiple monitors, typing away in an attempt to solve a cliff-hanger problem that requires coding a complex algorithm.

It’s all about the coding.

Photo by Lukas: https://www.pexels.com/photo/person-encoding-in-laptop-574071/

Those of us who work in the profession see it a bit differently, of course. Coding is a small part of what we do.

There are the meetings, the discussions, the frustrating attempts to set up development environments–everything but actually putting keystrokes to screen to produce a thing of Pythonic beauty.

And there’s one other task that precedes the actual coding, whether the end product is a functional UI, business logic, or a REST interface.

Design.

The design phase is where the real value of a good developer shines. Almost anyone can learn a programming language. But putting that language to work solving a problem or meeting a need? That’s design, and it’s a skill that is of immense value.

Design requires abstract thinking, the ability to take a concrete requirement–“put this logo on that web page only if the user logged in from a private account”–and translate it into an abstract representation–“logos will be held in a database table with this schema, indexed by corporate name, alongside a separate table, with its own schema, mapping the user’s account type to a group number representing private or public accounts”.

The latter allows the developer to implement, in the chosen coding language, the bridge between what the code can do and what the end-result is intended to be.
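To make that concrete, here is a minimal sketch of what such an abstract representation might look like in code. The table names, column names, and group numbers are invented for this example and are not taken from any real system:

```python
import sqlite3

# Illustrative schema only: names and group numbers are invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE logos (
        corporate_name TEXT PRIMARY KEY,
        logo_url       TEXT NOT NULL
    );
    CREATE TABLE account_groups (
        account_type TEXT PRIMARY KEY,
        group_id     INTEGER NOT NULL   -- e.g. 1 = private, 2 = public
    );
""")

PRIVATE_GROUP = 1

conn.execute("INSERT INTO logos VALUES ('Acme Corp', 'https://example.com/acme.png')")
conn.execute("INSERT INTO account_groups VALUES ('premium_private', 1)")

def logo_for(corporate_name, account_type):
    """Return the logo URL only when the account type maps to the private group."""
    grp = conn.execute(
        "SELECT group_id FROM account_groups WHERE account_type = ?",
        (account_type,),
    ).fetchone()
    if grp is None or grp[0] != PRIVATE_GROUP:
        return None  # public or unknown accounts never see the logo
    row = conn.execute(
        "SELECT logo_url FROM logos WHERE corporate_name = ?",
        (corporate_name,),
    ).fetchone()
    return row[0] if row else None

print(logo_for("Acme Corp", "premium_private"))  # the logo URL
print(logo_for("Acme Corp", "unknown_type"))     # None
```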

None of this is a new idea to developers–we do it all the time, often without thinking about it.

Sometimes, however, we get stuck in the design phase.

We can’t quite figure out how to do the “mapping” to code within the restrictions of what we have to work with.

There is a concept in psychology called “functional fixedness” that often hinders this part of the design process.

Functional fixedness is a cognitive bias that limits a person to use an object only in the way it is traditionally used. […] Karl Duncker defined functional fixedness as being a mental block against using an object in a new way that is required to solve a problem.

Wikipedia

The standard example of functional fixedness is that of the “candle box”.

Photo by SEPpics: https://www.freeimages.com/photo/candle-light-1170871

A participant is given a candle, a box of thumbtacks, and a book of matches, and asked to attach the candle to the wall in a way that will prevent melted wax from dripping on the floor or table.

Most people who are given this task will try to use melted wax or thumbtacks to attach the candle to the wall. This may or may not work, and doesn’t really deal with the request to prevent the melted wax from dripping.

The “best” solution requires thinking outside the box–literally.

Most participants looked at the thumbtack box as just a container for the thumbtacks.

A successful solution required a participant to break through this limited view of the thumbtack box.

The more expansive view? Empty the thumbtacks from the box, and use the thumbtack box as a platform for the candle, held to the wall with a thumbtack.

The ability to overcome functional fixedness was contingent on having a flexible representation of the word box which allows students to see that the box can be used when attaching a candle to a wall.

Wikipedia

This process of “stepping back” from our preconceived notions of the definition (“a container…”) and uses (“…to hold the thumbtacks”) to something more expansive (“it can hold something other than the thumbtacks and is rigid enough to hold a candle”) is important, and often very difficult.

There are ways in which functional fixedness can be overcome when in the design phase of a new IT system, for instance. One such approach I like to use is the “generic parts technique”.

In this approach, the designer begins by subdividing the components available for the solution. In the candle example, the designer would first define the thumbtack box component as “a box for holding thumbtacks”. Asking the question “does this definition imply a use?” the answer would be “yes”: this definition implies its use as a “thumbtack box”. Then, ask if this definition can be broken down into a new set of components, or modified to remove the usage implication.

In this case, it might be that “a box for holding thumbtacks” is transformed to “a box”, which can be used for “holding something“.

With that in mind, it’s a simple leap to that something being the candle.

In real life it’s not as simple as this. Individuals tend to get hung up at the functional fixedness stage far too easily. The solution: consider making the process a group process, with the context of the group interaction being “can we find new uses for the components that might help us solve the stated problem?”

This is only one way in which to break through functional fixedness–there are many others. A good source of information on this issue and methods for getting past it can be found here.

Enjoy your new-found freedom in solving design problems!


GIGO as applied to AI

copyright James Cornehlsen

The concept of “GIGO”–Garbage In, Garbage Out–has been around almost as long as computer programming itself.

GIGO is the idea that, no matter how well written and definitive a computer program or algorithm is, if you feed it bad data the resulting output will be “bad”–i.e., have no useful meaning or, at worst, misleading meaning.

Nothing surprising here–as programmers we are well aware of this problem and often take great pains to protect an algorithm implementation against “Garbage In”.
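As a minimal sketch of what that defensive care can look like in practice (the field names and plausibility limits here are invented purely for illustration):

```python
def validate_reading(record):
    """Reject obviously bad input at the boundary, before it reaches the algorithm.

    The expected shape (a non-empty 'sensor_id' string and a 'temperature_c'
    number between -50 and 60) is an invented example, not a real schema.
    """
    if not isinstance(record, dict):
        raise ValueError("record must be a dict")
    sensor_id = record.get("sensor_id")
    if not isinstance(sensor_id, str) or not sensor_id.strip():
        raise ValueError("missing or empty sensor_id")
    temp = record.get("temperature_c")
    if not isinstance(temp, (int, float)) or not (-50.0 <= temp <= 60.0):
        raise ValueError(f"temperature out of plausible range: {temp!r}")
    return record


validate_reading({"sensor_id": "A17", "temperature_c": 21.5})   # passes
# validate_reading({"sensor_id": "", "temperature_c": 9999})    # raises ValueError
```

Garbage is rejected at the door instead of silently turning into garbage out.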

It’s not possible to protect against all such cases, of course, human nature being what it is.

Which brings us to the story behind this blog posting: the improper use of Generative AI to “make decisions” in ways whose impact can be deeply damaging.

The starting point for this story: the state of Iowa in the United States is one of several states that have recently passed laws aimed at protecting young students from exposure to “inappropriate” materials in the school setting.

Senate File 496 includes limitations on school and classroom library collections, requiring that every book available to students be “age appropriate” and free of any “descriptions or visual depictions of a sex act” according to Iowa Code 702.17.

The Gazette

The Gazette (a daily newspaper in Cedar Rapids, Iowa) has the story of a school district in its area that has chosen to use AI (Machine Learning) to determine which books may run afoul of this new law.

Their reasoning for using AI? Assistant Superintendent of Curriculum and Instruction Bridgette Exman told The Gazette that it was “simply not feasible to read every book…”

Sounds reasonable, right?

Well, the school district chose to generate its list of proscribed books by feeding the AI “a list of proscribed books [provided from other sources]” and seeing whether the resulting output list presented “any surprises” to a staff librarian.

See the problem here? As noted in a blog about the news story:

The district didn’t run every book through the process, only the “commonly challenged” ones; if the end result was a list of commonly challenged books and no books that aren’t commonly challenged, well, there you go.

Daily Kos

It appears that people who don’t understand how to use Machine Learning misused it–GIGO?–and now have a trained AI that they think will allow them to filter out inappropriate books without having a human read and judge them.

Regardless of whether or not any of the titles do or do not contain said content, ChatGPT’s varying responses highlight troubling deficiencies of accuracy, analysis, and consistency. A repeat inquiry regarding The Kite Runner, for example, gives contradictory answers. In one response, ChatGPT deems Khaled Hosseini’s novel to contain “little to no explicit sexual content.” Upon a separate follow-up, the [Large Language Model] affirms the book “does contain a description of a sexual assault.”

Popular Science

This misuse of AI/ML is not uncommon–we’ve seen cases where law enforcement has trained facial recognition programs in a way which creates serious racial bias, for instance.

We, as IT professionals, need to be aware of and on the lookout for such misuses, as we are in the best position to spot such situations and understand how to avoid them.


We Work Against the Universe

Crystal structure of hexagonal ice, Wikimedia Commons

There is a concept in physics called entropy.

The simple definition of entropy–the reality is much more complex–is the state of order (or disorder) of a system. It can also be described as the state of information embedded in a system: lower entropy means more information.

An example often used to explain entropy is that of a system that starts as an ice cube. An ice cube is a highly ordered state of water–the individual water molecules are arrayed in a regular, repeating pattern of crystals. This pattern can be easily described. Each molecule is locked in place with no freedom to move. It is highly ordered.

Apply heat to the ice cube. It melts. The individual molecules are now free to move in the resulting liquid–water–and so the overall pattern can no longer be easily described. It is more disordered.

The water, in going from solid to liquid, has increased its entropy. The application of heat has made this change possible.
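For readers who like numbers, the textbook way to quantify this particular change is the entropy of fusion. Using the standard value of roughly 6.01 kJ/mol for the heat needed to melt ice at its melting point of 273 K:

```latex
\Delta S_{\text{fusion}} = \frac{Q}{T} \approx \frac{6010\ \text{J/mol}}{273\ \text{K}} \approx 22\ \text{J}\,\text{mol}^{-1}\,\text{K}^{-1}
```

The sign is the point: adding heat produced a positive change in entropy.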

Of course, we can move in the opposite direction: we can remove heat from the liquid water to return it to its highly-ordered, low entropy state.

The universe, as a whole, moves from a state of low entropy to high entropy–stars are running down, galaxies collapsing.

(For an interesting sci-fi take on this concept, read “The Last Question” by Isaac Asimov.)

Only in small, local environments–the freezer compartment of your refrigerator, for instance–can the general trend towards increased entropy be reversed.

To summarize one version of entropy: a low entropy system contains more information than a high entropy system.

What does this have to do with Information Technologists like ourselves?

We are agents of entropy change.

Think about a content delivery system that we might be developing. Certainly there are many ways to describe the purpose of the system–to deliver data to the end-user, to allow new concepts to be generated, and the like.

All of those purposes can be summed up in one simple description:

We have created a system that permits a local decrease in entropy by adding and collecting information. We can create these systems to be used by anyone, anywhere in the world, to increase knowledge and thereby decrease entropy.

With great power comes great responsibility.
The idea—similar to the 1st century BC parable of the Sword of Damocles and the medieval principle of noblesse oblige—is that power cannot simply be enjoyed for its privileges alone but necessarily makes its holders morally responsible both for what they choose to do with it and for what they fail to do with it.

Wikipedia

We can work against the general trend of the universe. This is an amazing power to hold. We can use it–or allow it to be used–for good or evil purposes.

Let’s all choose wisely.


Generative AI: a Warning

Attribution: Image by storyset on Freepik

Artificial intelligence has exploded upon the world in the form of the generative AI chatbot known as ChatGPT.

Only five days after its launch to the general public it had garnered one million users, far outpacing the uptake–at least by that metric–of any other program or social media system introduced since the dawn of the Internet.

And that amazing pace of uptake has not slowed. By the end of the second month after its introduction, it had shot up to 100 million users.

The astonishing rise of ChatGPT reveals both its usefulness in helping with a wide range of tasks and a general overflowing curiosity about human-like machines.

Time magazine Feb 2023

Others have spent much blog space on examining the why and how of the generative AI revolution that seems to be taking place. And much of that narrative extols the transcendent possibilities of the future of humanity in partnership with this new form of machine intelligence.

I want to take a somewhat different view here–one that is more admonitory and intentional in nature.

As is the case for any new technology, we, as IT professionals, will be among the groups cheerleading for wider use of generative AI in society–though it appears that little help is needed there.

It is also incumbent on us to serve the role of technology guardian on behalf of the society we inhabit. Most users of this new technology will not have the in-depth knowledge we have about the shortcomings of this new technology, and so cannot make fully informed judgements about its safe and proper use.

Some technology experts have warned of apocalyptic and even existential crises attendant upon the widespread use of ChatGPT and similar technologies. This is well and good–we need dissenting voices to make us aware of potential problems for society.

I want to point out another pitfall that appears to await us as we rush to the use of generative AI: the fact that, in one way, generative AI seems to mimic humans all too well.

We, and they, are able to lie with sincerity and authenticity.

If we treated AI with the same sense of skepticism with which we treat other humans–whom we know to harbor the same darker impulses we ourselves are capable of–this would not be a major issue.

But, interacting with AI, we seem more willing to suspend this skeptical viewpoint. This seems natural: we are not accustomed to the idea of machines failing us in this way, and there are few non-verbal cues we can rely on to determine veracity.

This is made worse by the fact that ChatGPT is designed to mimic human behavior and language, and it can do so with astonishing ease and rapidity.

So, we are led to consider a new “threat” from ChatGPT: that it can appear to provide definitive and truthful answers that can be taken at face value. And in some cases, those deceptive answers can do great harm.

One such case is where ChatGPT invented a sexual harassment scandal where none actually existed. And there are others.

Over the past couple of years, OpenAI and others have shown that AI algorithms trained on huge amounts of images or text can be capable of impressive feats. But because they mimic human-made images and text in a purely statistical way, rather than actually learning how the world works, such programs are also prone to making up facts and regurgitating hateful statements and biases—problems still present in ChatGPT.

Wired magazine Dec 7 2022

Does this mean that we need to call an immediate halt to the widespread use of ChatGPT as some groups have already done? For instance, Italy has already banned the use of ChatGPT. Legislation has been introduced in the US Congress to regulate its use (interestingly, the legislation itself was written by ChatGPT).

I think banning or severely restricting may be a step too far. Pausing may be a better step to take as we grapple with the downsides of this new technology.

Even that, however, may be seen as too much.

I would like to suggest another alternative: that we use our unique position as IT leaders and thinkers to cultivate in our clients, our friends, and ourselves a healthy sense of skepticism about the trustworthiness of this new tool.

Much like most of us already do with social media, we need to critically examine the claims that generative AI makes when we interact with it. ChatGPT and the like are only as good as the people who train them and the material chosen for that training.

ChatGPT is not an infallible Oracle of Delphi. It’s a tool, trained by humans to interact with humans in a “human” manner.

With all the good and bad that implies.


The Myth of “Lost Technology”

Attribution: Marsyas

As my wife and I were watching coverage of the return of the Orion capsule at the end of NASA’s first Artemis mission, I mentioned to her that at the end of the 1970s Apollo program I never imagined it would be half a century before we returned to the Moon.

After a pause she asked a question: “Did this program use any of the hardware of the original Apollo program?”

I was a bit taken aback–I often forget that people who are not space enthusiasts like me wouldn’t know such things–but told her that this was all new hardware, and that the original Apollo hardware and their designs were long gone.

Which reminded me of a trope that is common when it comes to the Apollo program–the myth of “Lost Technology”.

What is “Lost Technology”?

The definition often used by “The Lost Technology of XXX” TV programs is any process or product produced in the past that we no longer understand and do not have the original process to reproduce.

Now, in strict terms this may be true in a few cases. We do not know how Damascus Steel was produced in the Near East beginning in the 3rd century CE, a process that was no longer in use by the early 19th century CE. Does this mean this technology is lost?

Modern artisans have produced an equivalent to Damascus Steel, so while we do not know how the ancients produced it, we can make its replacement today using modern processes and materials.

Does this mean the production of Damascus Steel is a “Lost Technology”? Yes, in the sense that we do not know how it was originally produced. No, in the sense that we can produce its equivalent today, but using different techniques.

I would argue that this definition of “Lost Technology” has little useful meaning. While there is certainly value in knowing how ancient civilizations accomplished a specific task or produced a specific product, the fact that we can use modern techniques to accomplish the same outcome says we never lost the ability to create the end-product.

What we did lose was the institutional knowledge behind the technology in its original form.

Every industry has something called institutional, or tribal knowledge. Knowledge crucial to the industry which is never written down, either because its so basic that it’s not worth writing down or because it’s not something that can easily be written down.

Michael B, Quora

(I would add to this definition that some knowledge was never written down to keep it secret–this seems to have been the case for Damascus Steel.)

This is what happened with the Apollo program processes and designs. While we still have many of the original designs in blueprint or document form, the institutional knowledge is almost completely gone–those who had it are no longer with us, and the few that are still around probably can no longer remember.

So, is the Apollo project technology “lost”? In a very narrow sense, yes. We can no longer produce a Saturn V rocket in the same form in which it existed 50 years ago–we don’t have the skilled craftsmen who could, for instance, do the hand-drilling of the rocket engine injector baffle plates or hand-weld the propellant piping seams.

But this is where the definition of “Lost Technology” becomes meaningless.

Why, with the knowledge and processes advanced by 50 years, would we want to try to produce the same rocket engines in the same way they were made then? We can do far better with what we have learned since then, with the systems we now have.

Except for those–I am one–who would love to see that lovely old beast back in operation for one more flight, the fact that we can no longer produce it exactly as it was means little. Today, we can actually do better.

IT processes and products hardly seem old enough to fall prey to this “Lost Technology” syndrome, but computer technology changes much faster than the technologies of old.

And yet, we do see some of the effects of technology obsolescence that are close to producing “lost technologies”.

  • Quite a few institutions still rely on decades-old programs written in Cobol, a language no longer actively taught and for which few tools still exist.
  • The Defense Department’s Strategic Automated Command and Control System (DDSACCS), which is used to send and receive emergency action messages to US nuclear forces, runs on a 1970s IBM computing platform. It still uses 8in floppy disks to store data. “Replacement parts for the system are difficult to find because they are now obsolete.”
  • Whatever you may have, it’s no doubt more current than the system that air traffic controllers use to tell pilots about weather conditions at Paris’s Orly Airport: Windows 3.1. That’s not a typo – these flight-critical systems use an operating system that came out in 1992. When the machines went down in November 2015, planes were grounded while the airport had to find an IT guy who could deal with computers that ancient.
  • Sparkler Filters of Conroe, Texas, prides itself on being a leader in the world of chemical process filtration. If you buy an automatic nutsche filter from them, though, they’ll enter your transaction on a “computer” that dates from 1948. Sparkler’s IBM 402 is not a traditional computer, but an automated electromechanical tabulator that can be programmed (or more accurately, wired) to print out certain results based on values encoded into stacks of 80-column Hollerith-type punched cards.

All of these, of course, represent situations in which the product or system could be updated using more modern techniques, so they are not truly “lost”, except insofar as the original technologies are no longer in common use, and the users would be hard-pressed to make substantial changes or updates.

And therein, to me, lies the beauty of computer technology, its history, and its likely future.

We IT practitioners work in a world where nothing truly disappears or is lost. We keep old systems alive where appropriate, and we use the latest techniques to build new systems better than the old.

The myth of “Lost Technology” is just that–a myth.

I am glad, though, that “lost” technologies are kept around in some form for us to see how far we’ve come, and to appreciate the amazing accomplishments of those who came before us.


Long-distance Networking for IoT

In the early days of networking, copper was king.

First was Ethernet over coax–initially thicknet (10Base5) and then thinnet (10Base2). Both used a bus topology and both were limited in the distance over which they could be deployed–about 500 meters (roughly 1,600 feet) per segment for thicknet and 185 meters (about 600 feet) for thinnet. Because of those limitations, these versions of Ethernet did not make deep inroads into the market.

In the late 1980s, the invention of a version of Ethernet carried over twisted-pair cabling, and which used a star topology, kicked off a land-rush to connect computers and devices together. Combined with the invention of network bridges, routers, and other devices allowing connection of local Ethernet networks to the burgeoning Internet, wired networking became the dominant model.

While not ideal for some applications, this wired model served the market well until the invention of WiFi in the late 1990s. (It’s interesting to note that radio-based networking predated even Ethernet. The ALOHAnet radio network was launched in 1971 and actually provided the template for the Ethernet protocol.)

WiFi met an emerging market need, driven by the desire to interconnect networkable devices in places where wiring was not possible or not cost-effective. Short distances–typically up to a few hundred feet–could be bridged, allowing devices to be movable or placed in hard-to-reach locations.

Aside from improvements in WiFi speeds and encryption mechanisms, little changed over the following years in terms of the distances over which WiFi could be used. Some systems were built using high-gain antennas and specialized receivers and transmitters that extended the range up to several miles, but these were costly and required modified protocol stacks to deal with error conditions unique to radio links.

The rise of IoT drove the need for a new, low-cost wireless network that could provide connectivity over the distances some IoT sensors required. Soil humidity sensors on farms; engine sensors on mobile machinery; monitoring systems on drones. Now the distances needing to be covered could be several miles. Combined with the need to keep power consumption low, existing WiFi systems were not up to the challenge.

For a while, cellular data systems filled the need, but those required high power budgets and were typically expensive.

And so, LoRa entered the scene.

LoRa (short for “long range”) is a low-power radio protocol–and accompanying hardware–that provides exactly the networking capability needed for the new IoT world.

LoRa provides a mechanism which allows the user to determine the desired tradeoff between power, distance, and data rate. Of course, these are not independent of one another, but within limits they can be reasonably determined. And LoRa is not without its own limitations–packet size is small, though for most IoT uses it suffices.

As an example, a LoRa system can be set up to provide a data rate in the range of hundreds to thousands of bits per second over distances of several kilometers with ease. Distances of 700 kilometers and more have been achieved in experimental systems; small satellites (cubesats) using LoRa easily communicate with simple ground stations on a daily basis. While the data rates may seem low, they are adequate for most remotely-positioned IoT devices.
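As a rough sketch of that tradeoff, the nominal LoRa bit rate can be estimated from the spreading factor, bandwidth, and coding rate using the commonly quoted Semtech formula. The figures below ignore packet overhead and low-data-rate optimization, so treat them as approximations rather than guaranteed throughput:

```python
def lora_bitrate(sf, bw_hz, coding_rate=4 / 5):
    """Approximate raw LoRa bit rate in bits per second.

    Rb = SF * (BW / 2**SF) * CR: a higher spreading factor buys range and
    sensitivity at the cost of data rate (and longer, more power-hungry
    transmissions).
    """
    return sf * (bw_hz / 2 ** sf) * coding_rate


for sf in (7, 9, 12):
    print(f"SF{sf:>2} @ 125 kHz: ~{lora_bitrate(sf, 125_000):,.0f} bps")

# SF 7 @ 125 kHz: ~5,469 bps
# SF 9 @ 125 kHz: ~1,758 bps
# SF12 @ 125 kHz: ~293 bps
```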

And when not transmitting data the LoRa hardware can be shut down (IoT sensors tend to be episodic in their data delivery), lowering power requirements to the point where small batteries can power devices for months at a time.
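A back-of-the-envelope battery estimate shows why that duty-cycling matters. Every number here (sleep current, transmit current, airtime, reporting interval, battery capacity) is an assumption chosen purely for illustration, not a measurement of any particular device:

```python
# All figures are illustrative assumptions, not measurements of a real device.
SLEEP_CURRENT_MA = 0.01    # ~10 microamps with the radio and MCU asleep
TX_CURRENT_MA    = 120.0   # current draw while transmitting
TX_SECONDS       = 1.0     # airtime per report
PERIOD_SECONDS   = 600.0   # one report every 10 minutes
BATTERY_MAH      = 2000.0  # roughly a pair of AA-sized cells

# Average current is the time-weighted mix of transmitting and sleeping.
avg_ma = (TX_CURRENT_MA * TX_SECONDS
          + SLEEP_CURRENT_MA * (PERIOD_SECONDS - TX_SECONDS)) / PERIOD_SECONDS

hours = BATTERY_MAH / avg_ma
print(f"average draw ~{avg_ma:.2f} mA -> roughly {hours / 24:.0f} days per battery")
# average draw ~0.21 mA -> roughly 397 days per battery
```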

LoRa is, at its most basic, a point-to-point network, but the introduction of the LoRaWAN standards, and the use of a gateway device, makes it possible to have widely distributed devices that can be interconnected in much the same manner as WiFi provides.

And LoRaWAN has taken off amazingly in the last few years. Estimates are that over 170 million IoT devices are connected using LoRaWAN in 100 countries. In 2016 the Netherlands became the first country to have a nation-wide LoRaWAN network. Other countries have quickly followed this trend.

For those, like me, who are hackers at heart, the availability of inexpensive LoRa hardware (in the range of $5 – $10) and open-source software across a range of inexpensive platforms is heaven.

In fact, a coalition of enthusiasts has used this intersection of open-source software and low-cost hardware to set up open LoRaWAN networks worldwide. The Things Network boasts more than 20,000 LoRaWAN gateways in operation in 151 countries, all available for any member of the public to use.

To see an example of what can be done with LoRa, check out Tiny GS, a group of enthusiasts who set up low-cost satellite ground stations and receive telemetry from cubesats. Info on what my ground station has received can be found by logging into the Tiny GS website, selecting “Stations” from the hamburger menu, and searching for “Fall”.

Learning about LoRa and LoRaWAN by implementing one yourself is a great introduction to this networking concept, and will prepare you for interactions and projects with our IoT-using clients.

Enjoy!


Speech isn’t free, but it can cost less

In our current peri-COVID world, we all now have far more experience than we could ever have imagined in remote working.

Our homes are now our offices; dress codes have become more relaxed; we can work somewhat more flexible hours to accommodate our personal lives.

This has all come at a cost, of course. The biggest, in my opinion, is the need for higher bandwidth and more reliable Internet connections to our homes. In many cases, Internet Service Providers (ISPs) have been hard-pressed to provide new pipes, and “last mile” service installations have lagged.

The Internet core network has similarly been stressed–in analyses comparing pre- and peri-COVID data in several cities around the world, backbone data usage has gone up by as much as 40% year over year.

Much of this “need for speed” has been driven by the widespread use of teleconferencing software. Zoom, Microsoft Teams, Skype, Chime and others are in constant use around the world. Even with clever bandwidth-saving measures, the massively increased use of teleconferencing has created a demand for bandwidth that will probably remain with us post-COVID.

One of the contributors to the need for higher bandwidth in teleconferencing is the requirement to transmit timely and clear representations of speech in a digital format. Raw audio is highly resistant to general-purpose compression techniques–it’s too full of unpredictable data patterns–and added noise makes the problem even worse.

A number of coder/decoder (codec) algorithms have been invented to address the problem of transforming speech, in particular, into a digital form. Some are very clever, making use of models of speech generation to build compression schemes that are reasonably efficient in both computation and bandwidth. The models are made much more complex by the need to handle a wide range of languages–many of which differ substantially in their phonemes. Add in accents, speaking rate, and other variables and the models become extremely complex.

With the long history of language coder/decoder research, it would be easy to believe that there would be nothing new under the sun.

And that would be wrong.

Google has announced a new speech coding algorithm that appears to use much less bandwidth than existing algorithms, while preserving speech clarity and “normalness” better.

The new algorithm, named “Lyra”, is based on research into a new class of speech coding models: generative models.

These shortcomings have led to the development of a new generation of high-quality audio generative models that have revolutionized the field by being able to not only differentiate between signals, but also generate completely new ones.

One of the major issues with using these generative models is their computational complexity. Google has offered a solution to that problem, and it appears to deliver better performance at lower bandwidth, with better apparent “normalness” in the sound quality.

Lyra is currently designed to operate at 3kbps and listening tests show that Lyra outperforms any other codec at that bitrate and compares favorably to Opus at 8kbps, thus achieving more than a 60% reduction in bandwidth. Lyra can be used wherever the bandwidth conditions are insufficient for higher-bitrates and existing low-bitrate codecs do not provide adequate quality.
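To put those bitrates in perspective, here is a quick back-of-the-envelope comparison. The 3 kbps and 8 kbps figures come from the quote above; the one-hour call length is an arbitrary example:

```python
def audio_kilobytes(bitrate_kbps, minutes):
    """Payload size in kilobytes for a given bitrate and call duration."""
    return bitrate_kbps * 1000 / 8 * minutes * 60 / 1024


print(f"Lyra at 3 kbps: ~{audio_kilobytes(3, 60):,.0f} KB per hour")
print(f"Opus at 8 kbps: ~{audio_kilobytes(8, 60):,.0f} KB per hour")
print(f"Reduction: {1 - 3 / 8:.0%}")   # prints 62%, i.e. "more than a 60% reduction"
```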

The Google webpage announcing this news has examples of their algorithm in action compared to existing, widely used algorithms. The results are quite impressive.

What impacts will this have on teleconferencing? Google predicts that it will make teleconferencing possible over lower-bandwidth connections, and provide an algorithm that can be incorporated into existing and new applications.

Google plans to continue work in this area, most importantly to provide implementations that can be accelerated through GPUs and TPUs.

Be sure to listen for more exciting developments in speech coding, no matter what algorithm you use….


Apple’s New iPod? A New AI Weakness Revealed

Artificial Intelligence–AI–has come far since its first incarnation in 1956 as a theorem-proving program.

Most recently OpenAI, a machine learning research organization, announced the availability of CLIP, a general-purpose vision system based on neural networks. CLIP outperforms many existing vision systems on many of the most difficult test datasets.

[These datasets] stress test the model’s robustness to not just simple distortions or changes in lighting or pose, but also to complete abstraction and reconstruction—sketches, cartoons, and even statues of the objects.

https://openai.com/blog/multimodal-neurons/

It’s been known for several years from work by brain researchers that there exist “multimodal neurons” in the human brain, capable of responding not just to a single stimulus (e.g., vision) but to a variety of sensory inputs (e.g., vision and sound) in an integrated manner. These multimodal neurons permit the human brain to categorize objects in the real world.

The first example found of these multimodal neurons was the “Halle Berry neuron”, found by a team of researchers in 2005, which responds to pictures of the actress–including somewhat distorted ones, such as caricatures–and even to typed letter sequences of her name.

[P]ictures of Halle Berry activated a neuron in the right anterior hippocampus, as did a caricature of the actress, images of her in the lead role of the film Catwoman, and a letter sequence spelling her name.

Many more such neurons have been found since this seminal discovery.

The existence of multimodal neurons in artificial neural networks has been suspected for a while. Now, within the CLIP system, the existence of multimodal neurons has been demonstrated.

One such neuron, for example, is a “Spider-Man” neuron (bearing a remarkable resemblance to the “Halle Berry” neuron) that responds to an image of a spider, an image of the text “spider,” and the comic book character “Spider-Man” either in costume or illustrated.

OpenAI

This evidence for the same structures in both the human brain and artificial neural networks provides a powerful tool for understanding the functioning of both, and for better developing and training AI systems that use neural networks.

The degree of abstraction found in the CLIP networks, while a powerful investigative tool, also exposes one of its weaknesses.

CLIP’s multimodal neurons generalize across the literal and the iconic, which may be a double-edged sword.

OpenAI

As a result of the multimodal sensory input nature of CLIP, it’s possible to fool the system by providing contradictory inputs.

For instance, providing the system with a picture of a standard poodle results in correct identification of the object in a substantial percentage of cases. However, there appears to exist in CLIP a “finance neuron” that responds to pictures of piggy banks and “$” text characters. Forcing this neuron to fire by placing “$” characters over the image of the poodle causes CLIP to identify the dog as a piggy bank with an even higher percentage of confidence.

This discovery leads to the understanding that a new attack vector exists in CLIP, and presumably other similar neural networks. It’s been called the “typographic attack”.

This appears to be more than an academic observation–the attack is simple enough to be done without special tools, and thus may appear easily “in the wild”.

As an example of this, the CLIP researchers showed the network a picture of an apple. CLIP easily identified the apple correctly, even going so far as to identify the type of the apple–a Granny Smith–with high probability.

Adding a handwritten note to the apple with the word “iPod” on it caused CLIP to identify the item as an iPod with an even higher probability.
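For the curious, here is a minimal sketch of how one might reproduce that kind of zero-shot comparison with OpenAI’s open-source CLIP release. It assumes PyTorch and the clip package from github.com/openai/CLIP are installed; the image file names and prompt wording are illustrative:

```python
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Two pictures: a plain Granny Smith, and the same apple with an "iPod" note on it.
labels = ["a photo of a Granny Smith apple", "a photo of an iPod"]
text = clip.tokenize(labels).to(device)

for path in ("apple.jpg", "apple_with_ipod_note.jpg"):   # illustrative file names
    image = preprocess(Image.open(path)).unsqueeze(0).to(device)
    with torch.no_grad():
        logits_per_image, _ = model(image, text)
        probs = logits_per_image.softmax(dim=-1).squeeze().tolist()
    print(path, {label: round(p, 3) for label, p in zip(labels, probs)})
```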

The more serious issues here are easy to see: with the increased use of vision systems in the public sphere it would be very easy to fool such a system into making a biased categorization.

There’s certainly humor in being able to fool an AI vision system so easily, but the real lesson here is two-fold.

  • The identification of multimodal neurons in AI systems can be a powerful tool to understanding and improving their behavior.
  • With this power comes the need to understand and prevent the misuse of this power in ways that can seriously undermine the system’s accuracy.

We believe that these tools of interpretability may aid practitioners [in] the ability to preempt potential problems, by discovering some of these associations and ambiguities ahead of time.

OpenAI

With great power comes great responsibility, as Spider-Man has said.


It’s in the Water: Poor Security has Real-Life Consequences

As IT professionals, we are all painfully aware of the need for high-quality security in the systems we work with and deliver.

We know that if a system containing sensitive user information, such as bank account numbers, is not properly protected we risk exposure of that data to hackers and the resultant financial losses.

Encryption of data in flight and at rest; database input sanitizing; array bounds checking; firewalls; intrusion detection systems. All these, and more, are familiar security standards that we daily apply to the systems we design, implement, and deploy. eCommerce websites; B2B communications networks; public service APIs. These are the systems to which we apply these best practices.
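As one small, hedged illustration of the “database input sanitizing” item in that list (the table, column, and variable names are invented), here is the difference between an injectable query and a parameterized one:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (user_id TEXT, balance REAL)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100.0)")

user_supplied = "alice' OR '1'='1"   # hostile input

# Don't: string formatting lets the input rewrite the query itself.
unsafe = f"SELECT balance FROM accounts WHERE user_id = '{user_supplied}'"
print(conn.execute(unsafe).fetchall())                    # returns every row

# Do: a parameterized query treats the input strictly as data.
safe = "SELECT balance FROM accounts WHERE user_id = ?"
print(conn.execute(safe, (user_supplied,)).fetchall())    # returns nothing
```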

If we do not take due care, we risk the public’s confidence in the banking system, the services sector, and even the Internet itself.

Even the widespread issues that could result from breaches of these systems pale in comparison, I believe, to what could happen with systems that are more pervasive and more directly touch our everyday lives.

Much of our modern world is dependent on the workings of its vast infrastructure. Roadways, power plants, airports, shipping ports–all of these are fundamental to our existence. Infrastructure security is such an important issue that the United States government has an agency dedicated to it: the Cybersecurity & Infrastructure Security Agency–CISA.

Here in the US we just had a reminder of how important this topic is.

Just yesterday there was an intrusion into a water treatment plant in Oldsmar, Florida, in which the attacker attempted to raise the amount of sodium hydroxide by a factor of 100, from pipe-protecting levels to an amount that is potentially harmful to humans.

The good news is that the change was noticed by an attentive administrator, who then reversed it before it could take effect. The system in question has been taken offline until the intrusion is investigated and proper steps taken.

It’s unclear at this point whether the attacker was a bored teenager or a nation-state, or something in-between, but the effect would have been the same: danger to 15,000 people and a resulting lack of trust in the water delivery system.

As of the writing of this blog post there is little detail about how the hack was accomplished, though it appears that the hacker gained the use of credentials permitting remote access to the water treatment management system. From there, it was only a matter of the hacker poking around to find something of interest to “adjust”.

The Florida Governor has called this incident a “national security threat”, and in this case I don’t believe he is indulging in hyperbole.

CISA considers the US water supply one of the most critical infrastructure elements, and devotes an entire team of specialists to this topic.

Safe drinking water is a prerequisite for protecting public health and all human activity. Properly treated wastewater is vital for preventing disease and protecting the environment. Thus, ensuring the supply of drinking water and wastewater treatment and service is essential to modern life and the Nation’s economy.

CISA website

What should we take as a lesson from this?

I believe this incident is a cogent example of how brittle our national infrastructure is in the face of bad actors. Further, I believe that this incident makes abundantly clear that we need a renewed focus on updating, securing, and minimizing the attack surface of existing infrastructure control systems.

As IT professionals it is our responsibility to lend our expertise and unique viewpoint to inform our leaders in government and industry of the issues, their importance, and their potential solutions. To do so actively, and to do so regularly.

Computing professionals’ actions change the world. To act responsibly, they should reflect upon the wider impacts of their work, consistently supporting the public good.

ACM Code of Ethics and Professional Conduct, preamble

Trust and our Machines

Over the last few years I’ve seen a number of articles on how, as IT professionals, we can work to build users’ trust in the systems we produce. Clearly this is important, as a system that is not trusted by its targeted users will not be used, or will be used inefficiently.

A system trusted by a user, is one that the user feels safe to use, and trusts to do tasks without secretly executing harmful or unauthorised programs

Wikipedia

This seems an obvious topic of interest to IT professionals.

For instance, if customers of a bank do not trust the mobile app that lets them interact with their funds to accurately complete requested actions, it won’t be used.

But there’s a flip side to this trust coin that is not often talked about or studied: how do we design systems that we can be sure will not be trusted by all-too-trusting humans when it is inappropriate or unsafe to do so?

We actually experience this in our everyday lives, often without thinking about what it really means.

One example: compared to Google Maps on my phone, I have lower trust in my car’s navigation system to get me to the destination by the quickest route. As an IT professional, I know that Google Maps has access to real-time traffic information that the built-in system does not, and so I will rely on it more if getting to my destination in a timely manner is important.

My wife, who is not in the IT business, has almost complete trust in the vehicle navigation system to get her where she wants to go without making serious mistakes.

In a case like this, it’s not really of monumental importance which one of us can be accused of misplaced trust in a system. But there are cases where it’s very important.

For instance, current autonomous vehicles available to the general public are SAE Level 2 systems, which means they must be monitored by a human who is ready to intervene should it be necessary. If a Tesla computer cannot find the lane markings, it notifies the driver and hands over control.

But how many reports have we seen of Tesla drivers who treat the system as though it can take care of all situations, thereby making it safe for them to engage fully in other activities from which they cannot easily be interrupted?

Tesla Autopilot crash driver ‘was playing video game’

Tesla’s Autopilot lulled driver into a state of ‘inattention’ in 2018 freeway crash

One could say “there will always be stupid people” but this just sweeps the important problem under the rug: how do we design systems which instill an appropriate level of trust in the user? Clearly the Tesla system in these cases, or the context of the system’s use, instilled too much trust on the part of the user.

An interesting study done by the Stanford University School of Engineering addresses this topic and produced informative results.

The engineers looked at how people’s moods might affect their trust in autonomous products, such as smart speakers, and discovered a complicated relationship.

Unsurprisingly, the study found that a user’s opinion of the technology is the biggest determining factor in the user’s trust in the product. Surprisingly, it also found that users experiencing either positive or negative emotions tended to have higher levels of trust.

“An important takeaway from this research is that negative emotions are not always bad for forming trust. We want to keep this in mind because trust is not always good,” said Liao, who is now an assistant professor at the Stevens Institute of Technology in New Jersey and lead author of the paper.

This makes something clear: if we are to design systems that are trusted appropriately, we must understand that the relationship between the user’s knowledge, mood, and opinion of the system is more complex than we might imagine. We need to take into account more than just the level of trust we can instill through the system’s interaction with the human; we must also consider other confounding factors: age, gender, education. How to elicit and use this information in a manner that is not intrusive and doesn’t itself generate distrust is not currently clear–more study is needed.

As IT professionals, we must be aware that instilling a proper level of trust in the systems we build is important and focus on how to achieve that.