
Saturday, February 8, 2020

Knowledge graph evolution: Platforms that speak your language

Knowledge graphs are among the most important technologies for the 2020s. Here is how they are evolving, with vendors and standard bodies listening, and platforms becoming fluent in many query languages

By George Anadiotis for Big on Data



This may come as a shock if you first encountered knowledge graphs in Gartner's hype cycles and trends, or in the extensive coverage they have been getting lately. But here it is: knowledge graph technology is about 20 years old. This, however, does not mean it's stagnating -- on the contrary. 



The 20-year-old hype

First, let's quickly recap those 20 years of history. What we call Knowledge Graphs today was largely initiated by none other than Tim Berners-Lee in 2001. Berners-Lee, who is also credited as the inventor of the web, published his Semantic Web manifesto in Scientific American in 2001. The core concepts for Knowledge Graphs were laid out there.
The Semantic Web manifesto was in many ways ahead of its time. Looking back today, we can see some parts of it going strong, while others have faded. Building on a foundation of standards for interoperability, such as Unicode, URIs, and RDF, the core of the vision has always been semantics: instilling meaning in web content.
The Semantic Web technology stack, back in 2001. While its age shows, and some parts have become obsolete, others form the foundations of one of today's most hyped technologies: Knowledge Graphs


The Semantic Web got a bad name for being academic, while some technical choices such as XML did not quite work out. The thing is, however, that crawling and categorizing content on the web is a very hard problem to solve without semantics and metadata. This is why Google adopted the technology in 2010, by acquiring MetaWeb.
In 2012, the term Knowledge Graph was introduced. A very successful rebranding indeed, and that's not all we have Google to thank for. Google employs key people in the domain and is the driving force behind schema.org. Schema.org is the core of Google's knowledge graph. It is, unsurprisingly, a schema.
Knowledge graphs and schemas are foundationally bound. While not all knowledge graphs are as big as Google's, every one of them is based on a schema. Knowledge graph neophytes do not always realize this, but whether it's implicit or explicit, there's always a schema. Which brings us to the point.

Knowledge graphs and graph databases









Knowledge graphs can be stored in any back end, from files to relational databases or document stores. But since they are, well, graphs, it does make sense to store them in a graph database. This greatly facilitates storage and retrieval, as graph databases offer specialized structures, APIs, and query languages tailored for graphs.
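To make this concrete, here is a minimal sketch, assuming the Python rdflib library is installed, of building a tiny graph and querying it with SPARQL. The example.org namespace and the people in it are illustrative, not tied to any product mentioned here.

    # Build a tiny knowledge graph and query it with SPARQL (rdflib assumed installed).
    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, FOAF

    EX = Namespace("http://example.org/")

    g = Graph()
    g.add((EX.alice, RDF.type, FOAF.Person))      # a typed node (the schema part)
    g.add((EX.alice, FOAF.knows, EX.bob))         # an edge between two nodes
    g.add((EX.bob, FOAF.name, Literal("Bob")))    # a literal property

    # SPARQL, the standard RDF query language, traverses the graph directly
    results = g.query("""
        PREFIX foaf: <http://xmlns.com/foaf/0.1/>
        SELECT ?person ?friendName WHERE {
            ?person foaf:knows ?friend .
            ?friend foaf:name ?friendName .
        }
    """)
    for person, friend_name in results:
        print(person, "knows", friend_name)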
In addition, many graph databases today offer a lot more than just a store for data. They come packaged with algorithms for graph analytics, visualization capabilities, machine learning features, and development environments. They have essentially grown from databases to platforms. But there is further nuance here. 
Graph databases come in two main flavors, depending on which graph model they support: Property graph and RDF. In general, RDF graph databases emphasize semantics and interoperability, while property graph databases emphasize ease of use and performance.





Graph databases come in 2 main flavors, depending on which graph model they support: labeled property graph (LPG) and RDF. Image: The Year of the Graph

When it comes to knowledge graphs, RDF graph databases are a natural match. It's not impossible to build knowledge graphs on top of property graph databases. Usually, however, this results in having to learn knowledge management fundamentals the hard way, and re-implement relevant features. While lessons don't come for free, building on platforms centered around knowledge management helps.
Property graphs and RDF graphs are not that different conceptually. Having interoperability between them would be both possible and desirable. This is why in March 2019 a W3C workshop on web standardization for graph data took place, as the first step towards standardization in the graph database world.
A key element to bridge the gap is something called RDF* (RDF star). RDF* is a proposal to standardize a modeling construct for RDF graphs, namely the addition of properties to edges. Although this is possible in RDF, there is no standard way of doing it. Standardizing it would not only help interoperability with property graphs but also interoperability among RDF graphs.

From secret handshakes to RDF stars

As Steve Sarsfield, VP of Product at Cambridge Semantics, put it, before RDF*, if people wanted to use edge properties in RDF graphs, they had to rely on secret handshakes. This is not ideal, especially considering one of the key advantages of the RDF stack is standardization and interoperability.
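To illustrate what one such handshake looks like, here is a hedged sketch, again assuming rdflib, of the classic workaround -- standard RDF reification -- for attaching a property to an edge, with the equivalent RDF* form shown only as a comment. All names are illustrative.

    # Attaching a property ("since 2015") to the edge alice-knows-bob.
    # In plain RDF the usual workaround is reification: four extra triples that
    # describe the statement. RDF* aims to replace this with a single standard construct.
    from rdflib import Graph, Literal, Namespace, BNode
    from rdflib.namespace import RDF, FOAF

    EX = Namespace("http://example.org/")
    g = Graph()

    # The edge itself
    g.add((EX.alice, FOAF.knows, EX.bob))

    # Reification: a node that stands for the statement, so we can annotate it
    stmt = BNode()
    g.add((stmt, RDF.type, RDF.Statement))
    g.add((stmt, RDF.subject, EX.alice))
    g.add((stmt, RDF.predicate, FOAF.knows))
    g.add((stmt, RDF.object, EX.bob))
    g.add((stmt, EX.since, Literal(2015)))

    # With RDF* the same information goes directly on the edge, e.g. in Turtle-star:
    #   << ex:alice foaf:knows ex:bob >> ex:since 2015 .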
In the wake of the W3C initiative, a couple of RDF graph database vendors went ahead and implemented RDF*. Cambridge Semantics is one of them. Its AnzoGraph database supports RDF*, as well as SPARQL*. SPARQL is the standard query language for RDF, and SPARQL* is its extension that works with RDF*.
Cambridge Semantics recently unveiled AnzoGraph DB Version 2, and when discussing the release with Sarsfield, we wondered what their experience from the field has been. Are people asking for RDF*, has it helped adoption? Bridging the gap with property graphs has enabled AnzoGraph to get an implementation of Cypher, the most popular language for querying property graphs, underway.
Sarsfield noted that it's still relatively early days for knowledge graph adoption. As such, many of the organizations that use AnzoGraph tend to have highly skilled people on board. For them, switching between data models and query languages is not much of an issue. For mainstream adoption, however, this is important.
Stardog is another RDF graph database vendor that has implemented RDF*. Mike Grove, Stardog co-founder and VP Engineering, said this has been in the works for a while, and they are very excited about it. Stardog started working on the plumbing as part of the Stardog 7 development effort, and they were very happy to be able to ship the feature.
Regarding its reception, Grove noted that what people wanted was a more user-friendly way to have edge properties: "Neo4j obviously got this right. RDF* does a fantastic job of bringing the same ease of use to semantic graphs." He went on to add that customers are excited, and many are already working on integrating it into their applications.
Technically, RDF* and SPARQL* are not yet standardized. Both have been introduced by Olaf Hartig, a researcher at Linköping University. When inquiring about their status, Hartig noted that while there have been delays, he hopes the standardization process will pick up speed soon.

For knowledge graph platforms, too, GraphQL is a plus

Both Sarsfield and Grove noted that they expect RDF* to boost knowledge graph adoption. Implementation is key, and having early adopters and real-world usage may also catalyze the standardization process. Sarsfield and Grove expressed their support for the process, as well as the need to get the word out.
RDF* can make a difference, but it's not the only thing going on in the knowledge graph world. As knowledge graphs entail several layers and can be a central piece of infrastructure for organizations, graph databases are growing into platforms.
AnzoGraph started as part of the Anzo platform before becoming a product in its own right. Stardog also touts its product as a platform, emphasizing features such as visualization and virtualization built around the graph database core.





GraphQL has benefits, and a new breed of approaches for using it as an access layer for databases is emerging. Image: Nordic APIs

Another RDF graph database vendor, Ontotext, recently announced a new version of its own platform. An interesting feature that Stardog's and Ontotext's platforms share is support for GraphQL. Unfortunately, GraphQL's name does not do it justice. As if there was not enough confusion already around the word graph: GraphQL is not a graph query language.
GraphQL is an alternative to REST APIs. Despite the misnomer, it's very useful, and its popularity among developers is growing. This is why more and more databases are adding support for GraphQL, with names such as MongoDB joining the GraphQL wave. Graph databases are no exception. Stardog has had it since 2017, and Ontotext is in the process of adding it.
As Stardog put it, more developers know and are learning GraphQL than all the graph query languages combined. Ontotext on its part put together a rather elaborate post on the use of GraphQL in its platform. Whichever way you approach it, however, GraphQL makes lots of sense for accessing services built around database platforms.
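As a rough sketch of what accessing such a service looks like from code: GraphQL is typically served over HTTP as a single endpoint that accepts a query document. The endpoint URL and the fields below are hypothetical placeholders, not any vendor's actual schema; only the requests library is assumed.

    # Calling a (hypothetical) GraphQL endpoint from Python with requests.
    import requests

    GRAPHQL_ENDPOINT = "https://example.org/graphql"   # hypothetical endpoint

    query = """
    query ProductsByCategory($category: String!) {
      products(category: $category) {
        name
        price
      }
    }
    """

    response = requests.post(
        GRAPHQL_ENDPOINT,
        json={"query": query, "variables": {"category": "books"}},
        timeout=30,
    )
    response.raise_for_status()
    print(response.json()["data"]["products"])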

GraphQL plus variants

Stardog reports GraphQL success within its customer base. Grove mentioned that one of the big Silicon Valley tech companies exclusively uses GraphQL to interact with Stardog. Both Grove and Jem Rayfield, Ontotext's Chief Architect, agree that GraphQL can work well in some cases, but by its very design, the expressiveness of GraphQL is quite limited.
Most people who don't know GraphQL assume it's a graph database query language. Most people who know GraphQL wonder how a graph database can be powered by it. This statement comes from Manish Jain, the CEO and founder of Dgraph. Dgraph is a graph database powered by GraphQL -- or something like it.
GraphQL+ is a derivative of GraphQL, developed and used exclusively by Dgraph to date. In a 2019 interview with ZDNet, Jain expressed no interest in standardization for GraphQL+. No other vendor we know of has expressed interest in adopting GraphQL+ either. But that's not all there is to GraphQL for graph databases.
Most approaches are about what GraphQL can do for knowledge graphs. But to close the loop with the Semantic Web underpinning of knowledge graphs, here's an idea: What if GraphQL resources were annotated with URIs?
URIs are global identifiers, which can denote concepts from shared vocabularies, such as schema.org or other ontologies. This seems like a natural fit, and one that both Grove and Rayfield agree has potential. There is another working group set up to align RDF and GraphQL, although it does not look like it's moving very fast.

Knowledge graphs in the 2020s: We speak your language

It seems we are moving towards a new status quo. If NoSQL stands for Not Only SQL, we could call this NoSPARQL -- Not Only SPARQL. SPARQL remains the language of choice for taking full advantage of knowledge graph capabilities. It also doubles as an API, its expressiveness is beyond what GraphQL can attain, and SPARQL's federated query and data integration capabilities are unique.
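As an illustration of that last point, here is a hedged sketch using the SPARQLWrapper library against the public Wikidata endpoint. The SERVICE comment shows where SPARQL 1.1 federation would pull in a second endpoint in the same query; whether an endpoint permits federating out to another depends on its configuration.

    # Querying a public SPARQL endpoint with SPARQLWrapper (pip install sparqlwrapper).
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper("https://query.wikidata.org/sparql")
    sparql.setReturnFormat(JSON)
    sparql.setQuery("""
        SELECT ?item ?itemLabel WHERE {
          ?item wdt:P31 wd:Q146 .               # instances of "house cat"
          ?item rdfs:label ?itemLabel .
          FILTER(LANG(?itemLabel) = "en")
          # SPARQL 1.1 federation would add, e.g.:
          # SERVICE <https://dbpedia.org/sparql> { ... }
        }
        LIMIT 5
    """)
    results = sparql.query().convert()
    for row in results["results"]["bindings"]:
        print(row["item"]["value"], row["itemLabel"]["value"])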
But vendors seem set to meet users where they are, be it GraphQL or any other language. Even SQL. As Stardog's Grove put it: "We've always strived to bring our technology to the users. GraphQL was a step in that plan. Supporting SQL is the next step in that journey, not because SQL is better than GraphQL, but because of what that support enables."






Graph database vendors and standard bodies are listening to the market, and platforms fluent in many query languages are evolving
Getty Images/iStockphoto

SQL enables existing tooling to work on top of graph databases, making them accessible to a wider audience. Stardog is not the first graph database platform to have added an SQL connectivity layer. Cambridge Semantics also offers a connectivity layer for Tableau. More graph databases support SQL, and there is an ongoing standardization effort to add graph extensions to SQL itself.
Eventually, even natural language support could be an option. "No matter how you feel about SQL, SPARQL, GraphQL, or any other query syntax/language, natural language is just better. Why ask someone to learn an esoteric syntax when they can just simply type?" said Grove.
Grove mentioned Stardog will be launching a natural language interface to the knowledge graph. A pipedream? This may not be too far off. There is ongoing research for natural language interfaces for databases. And, to add to this, there are also existing integrations for accessing databases via voice assistants. So, you can see where this is going.
We don't know whether conversational knowledge graphs are something everyone would be comfortable with. What we do know is that having more options is a good thing, and exciting times are ahead. Stay tuned as we keep exploring the years of the graph.

https://zd.net/39b3K9Y

Sunday, November 10, 2019

Top 12 Artificial Intelligence Tools & Frameworks you need to know


Artificial Intelligence has facilitated the processing of large amounts of data and its use in industry. The number of tools and frameworks available to data scientists and developers has increased with the growth of AI and ML. This article on Artificial Intelligence Tools & Frameworks will list some of the most important ones.

Artificial Intelligence Tools & Frameworks

Developing neural networks is a long process that requires a lot of thought about the architecture, plus a whole bunch of nuances that actually make up the system.
These nuances can easily become overwhelming, and not everything can be tracked easily. Hence the need for such tools, where humans handle the major architectural decisions while other optimization tasks are left to the tools. Imagine an architecture with just 4 possible boolean hyperparameters: testing all possible combinations would take 2^4 = 16 runs. Retraining the same architecture 16 times is definitely not the best use of time and energy.
Also, most of the newer algorithms contain a whole bunch of hyperparameters. Here's where new tools come into the picture. These tools not only help develop these networks but also optimize them.
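To make that combinatorics concrete, here is a minimal sketch, assuming scikit-learn is installed, of handing those 16 combinations to a grid search instead of retraining by hand. The four boolean-style parameters below are arbitrary examples, not a recommendation.

    # Enumerate 2**4 = 16 hyperparameter combinations automatically with GridSearchCV.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)

    param_grid = {
        "bootstrap": [True, False],
        "warm_start": [True, False],
        "class_weight": [None, "balanced"],
        "max_features": ["sqrt", None],
    }  # 2 * 2 * 2 * 2 = 16 combinations

    search = GridSearchCV(RandomForestClassifier(n_estimators=50, random_state=0),
                          param_grid, cv=3)
    search.fit(X, y)
    print(search.best_params_, search.best_score_)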

List of  AI Tools & Frameworks

From the dawn of mankind, we as a species have been making things to assist us in day-to-day tasks -- from stone tools to modern machinery, and now to tools that help us develop the programs that assist us in daily life. Some of the most important tools and frameworks are:

Scikit Learn

Scikit-learn is one of the most well-known ML libraries. It supports many supervised and unsupervised learning algorithms. Examples include linear and logistic regression, decision trees, clustering, k-means, and so on. A short example follows the list below.

  • It builds on two essential Python libraries, NumPy and SciPy.
  • It adds a set of algorithms for common machine learning and data mining tasks, including clustering, regression and classification. Even tasks like transforming data, feature selection and ensemble methods can be implemented in a few lines.
  • For a beginner in ML, Scikit-learn is a more-than-adequate tool to work with, until you start implementing more complex algorithms.
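As a minimal sketch of the "few lines" point above, assuming scikit-learn is installed:

    # Load data, fit a decision tree, and evaluate it in a few lines of scikit-learn.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

    clf = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)
    print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))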

TensorFlow

If you are in the world of Artificial Intelligence, you have probably heard about, tried, or implemented some form of deep learning algorithm. Are they essential? Not always. Are they cool when done right? Absolutely!
The fascinating thing about TensorFlow is that when you write a program in Python, you can compile and run it on either your CPU or GPU. So you don't have to write at the C++ or CUDA level to run on GPUs.

It uses a system of multi-layered nodes that allows you to quickly set up, train, and deploy artificial neural networks with large datasets. This is what allows Google to identify objects in photos or understand spoken words in its voice-recognition application.
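A minimal sketch of that CPU/GPU point, assuming TensorFlow 2.x is installed; the same Python code runs on whichever device is available.

    # The same Python code runs on CPU or GPU without any CUDA code (TensorFlow 2.x).
    import tensorflow as tf

    print("GPUs visible:", tf.config.list_physical_devices("GPU"))

    a = tf.random.normal((1000, 1000))
    b = tf.random.normal((1000, 1000))
    c = tf.matmul(a, b)            # placed on a GPU if one is available, otherwise CPU
    print(c.device, c.shape)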

Theano

Theano is wonderfully wrapped by Keras, a high-level neural networks library that runs almost in parallel with the Theano library. Keras' main advantage is that it is a minimalist Python library for deep learning that can run on top of Theano or TensorFlow.
  • It was developed to make implementing deep learning models as fast and easy as possible for research and development.
  • It runs on Python 2.7 or 3.5 and can seamlessly execute on GPUs and CPUs.

What sets Theano apart is that it takes advantage of the computer's GPU. This allows it to run data-intensive calculations many times faster than on the CPU alone. Theano's speed makes it particularly valuable for deep learning and other computationally complex tasks.
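A tiny sketch in the classic Theano style (the project is no longer actively developed, so this is purely illustrative): declare symbolic variables, compile a function, and let Theano decide where it runs.

    # Classic Theano workflow: symbolic expression -> compiled function.
    import numpy as np
    import theano
    import theano.tensor as T

    x = T.dmatrix("x")
    y = T.dmatrix("y")
    z = x + 2 * y                      # symbolic expression
    f = theano.function([x, y], z)     # compiled callable (CPU or GPU, Theano decides)

    print(f(np.ones((2, 2)), np.ones((2, 2))))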

Caffe



'Caffe' is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors. Google's DeepDream is based on the Caffe framework. The framework is a BSD-licensed C++ library with a Python interface.

MxNet

It allows for trading computation time for memory via ‘forgetful backprop’ which can be very useful for recurrent nets on very long sequences.

  • Built with scalability in mind (fairly easy-to-use support for multi-GPU and multi-machine training).
  • Lots of cool features, like easily writing custom layers in high-level languages.
  • Unlike almost all other major frameworks, it is not directly governed by a major corporation, which is a healthy situation for an open-source, community-developed framework.
  • TVM support, which will further improve deployment support and allow running on a whole host of new device types. A brief sketch follows the list.
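The sketch referenced above, assuming MXNet is installed: NDArrays behave much like NumPy arrays but can live on a GPU, and Gluon layers are written in plain Python, which is what "custom layers in high-level languages" refers to.

    # Minimal MXNet sketch: place data on CPU or GPU and run it through a Gluon layer.
    import mxnet as mx
    from mxnet import nd
    from mxnet.gluon import nn

    ctx = mx.gpu() if mx.context.num_gpus() > 0 else mx.cpu()

    x = nd.random.uniform(shape=(4, 8), ctx=ctx)
    net = nn.Dense(3)                  # a simple fully connected layer
    net.initialize(ctx=ctx)
    print(net(x).shape)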

Keras

If you like the Python-way of doing things, Keras is for you. It is a high-level library for neural networks, using TensorFlow or Theano as its backend. 

The majority of practical problems are more like:
  • picking an architecture suitable for a problem,
  • for image recognition problems – using weights trained on ImageNet,
  • configuring a network to optimize the results (a long, iterative process).
In all of these, Keras is a gem. Also, it offers an abstract structure that can easily be converted to other frameworks if needed (for compatibility, performance or anything else). A short sketch follows.
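The sketch referenced above, assuming the TensorFlow-bundled Keras: a small network in a few lines, plus reusing ImageNet-trained weights for an image problem.

    # (1) a small fully connected network in a few lines of Keras
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(20,)),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    model.summary()

    # (2) reuse ImageNet-trained weights as a frozen feature extractor for an image task
    base = keras.applications.MobileNetV2(weights="imagenet", include_top=False,
                                          input_shape=(224, 224, 3))
    base.trainable = False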

    PyTorch



    CNTK

    CNTK allows users to easily realize and combine popular model types such as feed-forward DNNs, convolutional nets (CNNs), and recurrent networks (RNNs/LSTMs). It implements stochastic gradient descent (SGD, error backpropagation) learning with automatic differentiation and parallelization across multiple GPUs and servers. CNTK is available for anyone to try out, under an open-source license. 
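A minimal sketch in the CNTK 2.x Python API (illustrative only; CNTK is no longer actively developed): a small feed-forward model with a loss and learner that the built-in SGD machinery would then train.

    # A tiny feed-forward model, loss, and SGD learner in the CNTK 2.x Python API.
    import cntk as C

    features = C.input_variable(784)
    labels = C.input_variable(10)

    model = C.layers.Sequential([
        C.layers.Dense(64, activation=C.relu),
        C.layers.Dense(10),
    ])
    z = model(features)

    loss = C.cross_entropy_with_softmax(z, labels)
    metric = C.classification_error(z, labels)
    learner = C.sgd(z.parameters, lr=0.01)
    trainer = C.Trainer(z, (loss, metric), [learner])   # would drive the training loop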

    Auto ML

Out of all the tools and libraries listed above, AutoML is probably one of the strongest and a fairly recent addition to the arsenal of tools at a machine learning engineer's disposal.
As described in the introduction, optimization is of the essence in machine learning tasks. While the benefits it brings are lucrative, success in determining optimal hyperparameters is no easy task. This is especially true of black boxes like neural networks, where determining the things that matter becomes more and more difficult as the depth of the network increases.

Thus we enter a new realm of meta, wherein software helps us build software. AutoML is used by many machine learning engineers to optimize their models.
Apart from the obvious time saved, this can also be extremely useful for someone who doesn't have a lot of experience in the field of machine learning and thus lacks the intuition or past experience to make certain hyperparameter changes themselves.
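One concrete open-source example of this idea is auto-sklearn; a hedged sketch follows, assuming it is installed (it currently targets Linux), with arbitrary time budgets.

    # auto-sklearn searches over models and hyperparameters within a time budget.
    import autosklearn.classification
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    automl = autosklearn.classification.AutoSklearnClassifier(
        time_left_for_this_task=300,        # total search budget in seconds
        per_run_time_limit=30,              # budget per candidate model
    )
    automl.fit(X_train, y_train)
    print(automl.score(X_test, y_test))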

    OpenNN

Jumping from something completely beginner-friendly to something meant for experienced developers, OpenNN offers an arsenal of advanced analytics.
It features a tool, Neural Designer, for advanced analytics, which provides graphs and tables to interpret data entries.

H2O: Open Source AI Platform

H2O is an open-source deep learning platform. It is a business-oriented artificial intelligence tool that helps organizations make decisions from data and enables the user to draw insights. There are two open-source flavors: standard H2O and Sparkling Water, which integrates H2O with Apache Spark. It can be used for predictive modelling, risk and fraud analysis, insurance analytics, advertising technology, healthcare and customer intelligence.
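A short sketch of H2O's Python API, assuming the h2o package is installed (h2o.init() starts a local instance); the demo CSV path is one of H2O's public test files and may change.

    # Train a gradient boosting model with H2O's Python API.
    import h2o
    from h2o.estimators import H2OGradientBoostingEstimator

    h2o.init()
    frame = h2o.import_file(
        "https://s3.amazonaws.com/h2o-public-test-data/smalldata/iris/iris_wheader.csv"
    )
    train, test = frame.split_frame(ratios=[0.8], seed=1)

    model = H2OGradientBoostingEstimator(ntrees=50)
    model.train(x=frame.columns[:-1], y="class", training_frame=train)
    print(model.model_performance(test))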

    Google ML Kit

Google ML Kit, Google's machine learning beta SDK for mobile developers, is designed to enable developers to build personalised features on Android and iOS phones. 



    The kit allows developers to embed machine learning technologies with app-based APIs running on the device or in the cloud. These include features such as face and text recognition, barcode scanning, image labelling and more.
    Developers are also able to build their own TensorFlow Lite models in cases where the built-in APIs may not suit the use case.
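As a sketch of the "build your own TensorFlow Lite model" path, assuming TensorFlow 2.x: the conversion step happens in Python, and the resulting .tflite file is what the mobile app bundles; the model here is a trivial placeholder.

    # Convert a trained Keras model to a .tflite file for use on-device.
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(3, activation="softmax"),
    ])

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]   # optional size/latency optimization
    tflite_model = converter.convert()

    with open("model.tflite", "wb") as f:
        f.write(tflite_model)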
    With this, we have come to the end of our Artificial Intelligence Tools & Frameworks blog. These were some of the tools that serve as a platform for data scientists and engineers to solve real-life problems which will make the underlying architecture better and more robust.

    Tools of AI

    AI has developed many tools to solve the most difficult problems in computer science. A few of the most general of these methods are discussed below.

    Search and optimization

    Many problems in AI can be solved in theory by intelligently searching through many possible solutions:[177] Reasoning can be reduced to performing a search. For example, logical proof can be viewed as searching for a path that leads from premises to conclusions, where each step is the application of an inference rule.[178] Planning algorithms search through trees of goals and subgoals, attempting to find a path to a target goal, a process called means-ends analysis.[179] Robotics algorithms for moving limbs and grasping objects use local searches in configuration space.[122] Many learning algorithms use search algorithms based on optimization.
    Simple exhaustive searches[180] are rarely sufficient for most real-world problems: the search space (the number of places to search) quickly grows to astronomical numbers. The result is a search that is too slow or never completes. The solution, for many problems, is to use "heuristics" or "rules of thumb" that prioritize choices in favor of those that are more likely to reach a goal and to do so in a shorter number of steps. In some search methodologies heuristics can also serve to entirely eliminate some choices that are unlikely to lead to a goal (called "pruning the search tree"). Heuristics supply the program with a "best guess" for the path on which the solution lies.[181] Heuristics limit the search for solutions into a smaller sample size.[123]
A very different kind of search came to prominence in the 1990s, based on the mathematical theory of optimization. For many problems, it is possible to begin the search with some form of a guess and then refine the guess incrementally until no more refinements can be made. These algorithms can be visualized as blind hill climbing: we begin the search at a random point on the landscape, and then, by jumps or steps, we keep moving our guess uphill, until we reach the top. Other optimization algorithms are simulated annealing, beam search and random optimization.[182]
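A minimal hill-climbing sketch matching that description: start from a random guess and keep taking uphill steps until no neighbour improves; the toy objective is arbitrary.

    # Blind hill climbing on a toy objective with a single peak at x = 3.
    import random

    def score(x):
        return -(x - 3) ** 2

    def hill_climb(steps=1000, step_size=0.1):
        current = random.uniform(-10, 10)          # random starting guess
        for _ in range(steps):
            candidates = [current + step_size, current - step_size]
            best = max(candidates, key=score)
            if score(best) <= score(current):      # no uphill move left: stop
                break
            current = best
        return current

    print(hill_climb())   # converges near 3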
    particle swarm seeking the global minimum

Evolutionary computation uses a form of optimization search. For example, they may begin with a population of organisms (the guesses) and then allow them to mutate and recombine, selecting only the fittest to survive each generation (refining the guesses). Classic evolutionary algorithms include genetic algorithms, gene expression programming, and genetic programming.[183] Alternatively, distributed search processes can coordinate via swarm intelligence algorithms. Two popular swarm algorithms used in search are particle swarm optimization (inspired by bird flocking) and ant colony optimization (inspired by ant trails).[184][185]
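A bare-bones sketch in that spirit: a population of guesses is recombined and mutated, and only the fittest survive each generation; the target string is just a toy fitness function.

    # A toy genetic algorithm that evolves random strings toward a target.
    import random
    import string

    TARGET = "knowledge graph"
    ALPHABET = string.ascii_lowercase + " "

    def fitness(candidate):
        return sum(a == b for a, b in zip(candidate, TARGET))

    def mutate(candidate, rate=0.1):
        return "".join(random.choice(ALPHABET) if random.random() < rate else c
                       for c in candidate)

    def crossover(a, b):
        cut = random.randrange(len(TARGET))
        return a[:cut] + b[cut:]

    population = ["".join(random.choice(ALPHABET) for _ in TARGET) for _ in range(100)]
    for generation in range(500):
        population.sort(key=fitness, reverse=True)
        if population[0] == TARGET:
            break
        parents = population[:20]                               # the fittest survive
        population = [mutate(crossover(*random.sample(parents, 2)))
                      for _ in range(100)]
    print(generation, max(population, key=fitness))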

Logic

    Logic[186] is used for knowledge representation and problem solving, but it can be applied to other problems as well. For example, the satplan algorithm uses logic for planning[187] and inductive logic programming is a method for learning.[188]
    Several different forms of logic are used in AI research. Propositional logic[189] involves truth functions such as "or" and "not". First-order logic[190] adds quantifiers and predicates, and can express facts about objects, their properties, and their relations with each other. Fuzzy set theory assigns a "degree of truth" (between 0 and 1) to vague statements such as "Alice is old" (or rich, or tall, or hungry) that are too linguistically imprecise to be completely true or false. Fuzzy logic is successfully used in control systems to allow experts to contribute vague rules such as "if you are close to the destination station and moving fast, increase the train's brake pressure"; these vague rules can then be numerically refined within the system. Fuzzy logic fails to scale well in knowledge bases; many AI researchers question the validity of chaining fuzzy-logic inferences.[e][192][193]
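A tiny sketch of the "degree of truth" idea: membership functions map a crisp value to a truth value between 0 and 1, and fuzzy AND/OR are commonly taken as min/max; the thresholds are arbitrary.

    # Fuzzy membership functions and simple rule combination for "Alice is old / rich".
    def is_old(age):
        # 0 below 50, 1 above 80, linear in between
        return min(1.0, max(0.0, (age - 50) / 30))

    def is_rich(savings):
        return min(1.0, max(0.0, savings / 1_000_000))

    alice_age, alice_savings = 67, 250_000
    truth_old = is_old(alice_age)                 # ~0.57: "somewhat true"
    truth_rich = is_rich(alice_savings)           # 0.25

    # fuzzy AND is often taken as min, fuzzy OR as max
    print("old AND rich:", min(truth_old, truth_rich))
    print("old OR rich:", max(truth_old, truth_rich))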
Default logics, non-monotonic logics and circumscription[98] are forms of logic designed to help with default reasoning and the qualification problem. Several extensions of logic have been designed to handle specific domains of knowledge, such as: description logics;[86] situation calculus, event calculus and fluent calculus (for representing events and time);[87] causal calculus;[88] belief calculus;[194] and modal logics.[89]
    Overall, qualitative symbolic logic is brittle and scales poorly in the presence of noise or other uncertainty. Exceptions to rules are numerous, and it is difficult for logical systems to function in the presence of contradictory rules.[195][196]
    Expectation-maximization clustering of Old Faithful eruption data starts from a random guess but then successfully converges on an accurate clustering of the two physically distinct modes of eruption.

Probabilistic methods for uncertain reasoning

Artificial neural networks

    A neural network is an interconnected group of nodes, akin to the vast network of neurons in the human brain.

    Neural networks were inspired by the architecture of neurons in the human brain. A simple "neuron" N accepts input from multiple other neurons, each of which, when activated (or "fired"), cast a weighted "vote" for or against whether neuron N should itself activate. Learning requires an algorithm to adjust these weights based on the training data; one simple algorithm (dubbed "fire together, wire together") is to increase the weight between two connected neurons when the activation of one triggers the successful activation of another. The neural network forms "concepts" that are distributed among a subnetwork of shared[j] neurons that tend to fire together; a concept meaning "leg" might be coupled with a subnetwork meaning "foot" that includes the sound for "foot". Neurons have a continuous spectrum of activation; in addition, neurons can process inputs in a nonlinear way rather than weighing straightforward votes. Modern neural networks can learn both continuous functions and, surprisingly, digital logical operations. Neural networks' early successes included predicting the stock market and (in 1995) a mostly self-driving car.[k][220] In the 2010s, advances in neural networks using deep learning thrust AI into widespread public consciousness and contributed to an enormous upshift in corporate AI spending; for example, AI-related M&A in 2017 was over 25 times as large as in 2015.[221][222]
The study of non-learning artificial neural networks[210] began in the decade before the field of AI research was founded, in the work of Walter Pitts and Warren McCulloch. Frank Rosenblatt invented the perceptron, a learning network with a single layer, similar to the old concept of linear regression. Early pioneers also include Alexey Grigorevich Ivakhnenko, Teuvo Kohonen, Stephen Grossberg, Kunihiko Fukushima, Christoph von der Malsburg, David Willshaw, Shun-Ichi Amari, Bernard Widrow, John Hopfield, Eduardo R. Caianiello, and others[citation needed].
The main categories of networks are acyclic or feedforward neural networks (where the signal passes in only one direction) and recurrent neural networks (which allow feedback and short-term memories of previous input events). Among the most popular feedforward networks are perceptrons, multi-layer perceptrons and radial basis networks.[223] Neural networks can be applied to the problem of intelligent control (for robotics) or learning, using such techniques as Hebbian learning ("fire together, wire together"), GMDH or competitive learning.[224]
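A minimal sketch of the neuron-and-weights picture above: a unit "fires" when the weighted votes of its inputs cross a threshold, and a Hebbian-style update ("fire together, wire together") strengthens weights between co-active units; all numbers are arbitrary illustrations.

    # A single threshold unit plus a Hebbian weight update.
    import numpy as np

    rng = np.random.default_rng(0)
    weights = rng.normal(0, 0.1, size=4)      # incoming connection strengths
    threshold = 0.5
    learning_rate = 0.1

    def fires(inputs, weights):
        return float(np.dot(inputs, weights) > threshold)

    inputs = np.array([1.0, 0.0, 1.0, 1.0])   # which upstream neurons fired
    output = fires(inputs, weights)

    # Hebbian update: strengthen weights from inputs that were active
    # whenever the output neuron was active too.
    weights += learning_rate * output * inputs
    print(output, weights)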
    Today, neural networks are often trained by the backpropagation algorithm, which had been around since 1970 as the reverse mode of automatic differentiation published by Seppo Linnainmaa,[225][226] and was introduced to neural networks by Paul Werbos.[227][228][229]
    Hierarchical temporal memory is an approach that models some of the structural and algorithmic properties of the neocortex.[230]
    To summarize, most neural networks use some form of gradient descent on a hand-created neural topology. However, some research groups, such as Uber, argue that simple neuroevolution to mutate new neural network topologies and weights may be competitive with sophisticated gradient descent approaches[citation needed]. One advantage of neuroevolution is that it may be less prone to get caught in "dead ends".[231]

Deep feedforward neural networks

Deep learning is any artificial neural network that can learn a long chain of causal links[dubious ]. For example, a feedforward network with six hidden layers can learn a seven-link causal chain (six hidden layers + output layer) and has a "credit assignment path" (CAP) depth of seven[citation needed]. Many deep learning systems need to be able to learn chains ten or more causal links in length.[232] Deep learning has transformed many important subfields of artificial intelligence[why?], including computer vision, speech recognition, natural language processing and others.[233][234][232]
    According to one overview,[235] the expression "Deep Learning" was introduced to the machine learning community by Rina Dechter in 1986[236] and gained traction after Igor Aizenberg and colleagues introduced it to artificial neural networks in 2000.[237] The first functional Deep Learning networks were published by Alexey Grigorevich Ivakhnenko and V. G. Lapa in 1965.[238][page needed] These networks are trained one layer at a time. Ivakhnenko's 1971 paper[239] describes the learning of a deep feedforward multilayer perceptron with eight layers, already much deeper than many later networks. In 2006, a publication by Geoffrey Hinton and Ruslan Salakhutdinov introduced another way of pre-training many-layered feedforward neural networks (FNNs) one layer at a time, treating each layer in turn as an unsupervised restricted Boltzmann machine, then using supervised backpropagation for fine-tuning.[240] Similar to shallow artificial neural networks, deep neural networks can model complex non-linear relationships. Over the last few years, advances in both machine learning algorithms and computer hardware have led to more efficient methods for training deep neural networks that contain many layers of non-linear hidden units and a very large output layer.[241]
    Deep learning often uses convolutional neural networks (CNNs), whose origins can be traced back to the Neocognitron introduced by Kunihiko Fukushima in 1980.[242] In 1989, Yann LeCun and colleagues applied backpropagation to such an architecture. In the early 2000s, in an industrial application CNNs already processed an estimated 10% to 20% of all the checks written in the US.[243] Since 2011, fast implementations of CNNs on GPUs have won many visual pattern recognition competitions.[232]
    CNNs with 12 convolutional layers were used in conjunction with reinforcement learning by Deepmind's "AlphaGo Lee", the program that beat a top Go champion in 2016.[244]

Deep recurrent neural networks

    Early on, deep learning was also applied to sequence learning with recurrent neural networks (RNNs)[245] which are in theory Turing complete[246] and can run arbitrary programs to process arbitrary sequences of inputs. The depth of an RNN is unlimited and depends on the length of its input sequence; thus, an RNN is an example of deep learning.[232] RNNs can be trained by gradient descent[247][248][249] but suffer from the vanishing gradient problem.[233][250] In 1992, it was shown that unsupervised pre-training of a stack of recurrent neural networks can speed up subsequent supervised learning of deep sequential problems.[251]
    Numerous researchers now use variants of a deep learning recurrent NN called the long short-term memory (LSTM) network published by Hochreiter & Schmidhuber in 1997.[252] LSTM is often trained by Connectionist Temporal Classification (CTC).[253] At Google, Microsoft and Baidu this approach has revolutionised speech recognition.[254][255][256] For example, in 2015, Google's speech recognition experienced a dramatic performance jump of 49% through CTC-trained LSTM, which is now available through Google Voice to billions of smartphone users.[257] Google also used LSTM to improve machine translation,[258] Language Modeling[259] and Multilingual Language Processing.[260] LSTM combined with CNNs also improved automatic image captioning[261] and a plethora of other applications.

Evaluating progress

    AI, like electricity or the steam engine, is a general purpose technology. There is no consensus on how to characterize which tasks AI tends to excel at.[262] While projects such as AlphaZero have succeeded in generating their own knowledge from scratch, many other machine learning projects require large training datasets.[263][264] Researcher Andrew Ng has suggested, as a "highly imperfect rule of thumb", that "almost anything a typical human can do with less than one second of mental thought, we can probably now or in the near future automate using AI."[265] Moravec's paradox suggests that AI lags humans at many tasks that the human brain has specifically evolved to perform well.[128]
    Games provide a well-publicized benchmark for assessing rates of progress. AlphaGo around 2016 brought the era of classical board-game benchmarks to a close. Games of imperfect knowledge provide new challenges to AI in the area of game theory.[266][267] E-sports such as StarCraft continue to provide additional public benchmarks.[268][269] There are many competitions and prizes, such as the Imagenet Challenge, to promote research in artificial intelligence. The most common areas of competition include general machine intelligence, conversational behavior, data-mining, robotic cars, and robot soccer as well as conventional games.[270]
    The "imitation game" (an interpretation of the 1950 Turing test that assesses whether a computer can imitate a human) is nowadays considered too exploitable to be a meaningful benchmark.[271] A derivative of the Turing test is the Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA). As the name implies, this helps to determine that a user is an actual person and not a computer posing as a human. In contrast to the standard Turing test, CAPTCHA is administered by a machine and targeted to a human as opposed to being administered by a human and targeted to a machine. A computer asks a user to complete a simple test then generates a grade for that test. Computers are unable to solve the problem, so correct solutions are deemed to be the result of a person taking the test. A common type of CAPTCHA is the test that requires the typing of distorted letters, numbers or symbols that appear in an image undecipherable by a computer.[272]
    Proposed "universal intelligence" tests aim to compare how well machines, humans, and even non-human animals perform on problem sets that are generic as possible. At an extreme, the test suite can contain every possible problem, weighted by Kolmogorov complexity; unfortunately, these problem sets tend to be dominated by impoverished pattern-matching exercises where a tuned AI can easily exceed human performance levels.[273][274]