AI Paradigms Overview
Quick Look Summary
| Concept | What it is (1‑sentence) | Core tech / algorithmic family | Typical “sweet‑spot” tasks | Main strengths | Typical limits |
|---|---|---|---|---|---|
| Classical AI | Rule‑based systems that manipulate explicit symbols, logic and search. | Expert systems, production rules, planning/search (A*, SAT solvers), knowledge graphs. | Diagnostic reasoning, theorem proving, constraint solving, high‑level robotics planning. | Fully explainable; works with very little data. | Brittle to noise; hard to scale to perception‑heavy domains. |
| Machine Learning | Algorithms that learn statistical patterns from data without being hand‑programmed. | Linear/logistic regression, decision trees, SVMs, clustering, reinforcement learning, shallow NNs. | Spam detection, churn prediction, recommendation, simple classification/regression. | Good with moderate data; fast to train; relatively interpretable (tree‑based). | Feature engineering still required; performance caps on raw, high‑dim data. |
| Neural Networks | Computation graphs composed of interconnected “neurons” that approximate functions. | Perceptron, multilayer perceptron (MLP), convolutional layers, recurrent layers, attention heads. | Image/video classification, speech recognition, small‑scale sequence/language modeling. | Learns features automatically; smooth function approximation. | Shallow nets struggle with very complex hierarchies; often need lots of data. |
| Deep Learning | Deep (many‑layer) neural networks that can learn hierarchical representations. | CNNs, RNNs/LSTMs, Transformers, Graph Neural Networks, Diffusion models. | Vision (object detection, segmentation), NLP (translation, chat), speech‑to‑text, game‑playing. | State‑of‑the‑art accuracy on perception tasks; scales with compute & data. | Data‑hungry, opaque (hard to explain), expensive to train/infer. |
| Generative AI | Models that create new data (text, images, code, music, etc.) rather than just label it. | Autoregressive Transformers (GPT‑x), diffusion models (Stable Diffusion, DALL·E), VAEs, GANs. | Content creation, data augmentation, code synthesis, design, simulation. | Produces novel, high‑fidelity outputs; can be finetuned for many domains. | Hallucinations, lack of factual grounding, bias, copyright concerns. |
| Agentic AI | An autonomous “agent” that decides, plans, and acts in an environment to achieve goals (often using DL/LLM + tool use). | Reinforcement‑learning agents, LLM‑driven agents (ChatGPT‑plugins, ReAct, AutoGPT), multi‑agent coordination frameworks. | Browsing the web, executing code, orchestrating APIs, robotic control, game‑playing, automated workflows. | Can “reason” over tools and data, perform multi‑step tasks without human micromanagement. | Safety/alignment challenges, unpredictable behavior, high compute cost, requires reliable external tools. |
In-Depth Look at Each Paradigm
Classical AI
Idea: Intelligence can be captured by manipulating symbols and logical rules.
- Key ingredients: Knowledge bases, ontologies, rule engines, logical inference, search algorithms (A*, Dijkstra), planning (STRIPS, PDDL).
- Classic example: MYCIN (1970s medical expert system) that used if‑then rules to diagnose infections.
- Strengths: Fully transparent; works with very little data; easy to audit.
- Weaknesses: Brittle when faced with noisy or unseen situations; hard to scale to perception‑heavy tasks (vision, speech).
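The if‑then style of MYCIN can be sketched as a tiny forward‑chaining rule engine. The rules and fact names below are hypothetical, purely for illustration:

```python
# A minimal forward-chaining rule engine, MYCIN-style: when every
# antecedent fact of a rule is known, its consequent is added as a fact.
# Rules and fact names are made up for illustration.
RULES = [
    ({"fever", "stiff_neck"}, "suspect_meningitis"),
    ({"suspect_meningitis", "gram_negative"}, "suggest_antibiotic_x"),
]

def forward_chain(facts, rules):
    """Repeatedly fire any rule whose antecedents are all known facts."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            if antecedents <= facts and consequent not in facts:
                facts.add(consequent)
                changed = True
    return facts

derived = forward_chain({"fever", "stiff_neck", "gram_negative"}, RULES)
```

Note how chaining works: the first rule's conclusion becomes an antecedent of the second, which is exactly the transparency (and the brittleness) described above — every conclusion can be traced back to explicit rules, but an unanticipated fact simply fires nothing.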
Machine Learning
Idea: Let a computer learn a mapping from inputs → outputs by optimizing a loss function on data.
- Categories: Supervised, unsupervised, semi‑supervised, reinforcement learning.
- Typical algorithms: Logistic regression, decision trees, random forests, SVMs, k‑means, Q‑learning.
- When it shines: Structured/tabular data, moderate data volumes, problems where interpretability matters.
- Limitations: Requires hand‑crafted features; performance caps on raw high‑dim data (images, raw text).
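As a sketch of loss‑driven learning, here is logistic regression trained by plain gradient descent on a hypothetical 1‑D toy dataset — no library, just the math:

```python
import math

# Minimal logistic regression trained by gradient descent on a toy
# 1-D dataset (illustrative numbers only). Labels flip around x ≈ 2.5.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 0, 1, 1, 1]

w, b, lr = 0.0, 0.0, 0.5
for _ in range(2000):
    gw = gb = 0.0
    for x, y in zip(xs, ys):
        p = 1 / (1 + math.exp(-(w * x + b)))   # sigmoid prediction
        gw += (p - y) * x                       # gradient of the log-loss
        gb += (p - y)
    w -= lr * gw / len(xs)
    b -= lr * gb / len(xs)

def predict(x):
    """Classify x using the learned weights."""
    return 1 / (1 + math.exp(-(w * x + b))) > 0.5
```

The loop is the whole paradigm in miniature: a parametric model, a differentiable loss, and repeated gradient steps. Everything from SVMs to deep networks changes the model and loss, not this basic recipe.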
Neural Networks
Idea: A network of simple computational units (neurons) arranged in layers can approximate any continuous function on a compact domain to arbitrary accuracy, given enough hidden units (Universal Approximation Theorem).
- Core parts: Input layer → hidden layers (weights + non‑linearities) → output layer.
- Common variants: Fully‑connected (MLP), convolutional layers (CNNs) for spatial locality, recurrent layers (RNN/LSTM/GRU) for sequences, attention heads.
- Strengths: Learns features automatically; flexible function approximator.
- Typical failure mode: Shallow nets struggle with complex hierarchical patterns; they still need a decent amount of data.
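A classic illustration of why hidden layers matter: XOR is not linearly separable, so a single perceptron cannot compute it, but one hidden layer of two units can. The weights below are hand‑set for clarity rather than learned:

```python
def step(z):
    """Threshold non-linearity, the original perceptron activation."""
    return 1 if z > 0 else 0

def mlp_xor(x1, x2):
    """XOR via two hidden units (an OR detector and an AND detector)
    feeding one output unit. Weights are hand-set for illustration;
    in practice they would be learned by backpropagation."""
    h_or = step(x1 + x2 - 0.5)    # fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)   # fires only if both inputs are 1
    return step(h_or - h_and - 0.5)  # OR but not AND = XOR
```

The hidden layer re-represents the inputs (OR-ness, AND-ness) so that the output unit faces a linearly separable problem — a two-line demonstration of why composing layers with non-linearities adds expressive power.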
Deep Learning
Idea: Use many‑layer neural networks to learn hierarchical feature representations automatically.
- Why “deep” matters: Early layers capture low‑level patterns (edges, n‑grams); deeper layers capture high‑level concepts (objects, syntax).
- Landmark breakthroughs: AlexNet (2012, ImageNet), Transformers (Vaswani et al., 2017), BERT/GPT families (NLP), Diffusion models (image generation).
- Typical tasks: Image classification, object detection, speech‑to‑text, language understanding, game‑playing.
- Pros: State‑of‑the‑art accuracy on perception tasks; scales with compute & data.
- Cons: Data‑hungry, opaque, expensive to train/infer.
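The “early layers capture low‑level patterns” idea can be illustrated with the basic building block of a CNN: a 1‑D convolution acting as an edge detector. The signal and kernel are toy values, and the kernel is hand‑picked rather than learned:

```python
def conv1d(signal, kernel):
    """Valid 1-D convolution (cross-correlation, as in CNN layers):
    slide the kernel over the signal and take dot products."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

# A finite-difference kernel fires exactly at the step in the signal,
# i.e. it detects an "edge" - the kind of low-level feature early
# CNN layers learn on their own.
signal = [0, 0, 0, 1, 1, 1]
edges = conv1d(signal, [-1, 1])
```

Stacking such layers — edges into textures into parts into objects — is what “hierarchical representation” means; the depth lets later layers build on features detected earlier.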
Generative AI
Idea: Instead of just labeling, the model learns to sample from the data distribution, producing new, plausible examples.
- Two broad families:
- Autoregressive (e.g., GPT‑3, Codex) – predict next token conditioned on previous tokens.
- Latent/denoising (VAEs, diffusion models) – learn a latent space and decode/denoise to synthesize data.
- Key applications: Chatbots, code assistants, image generation (Stable Diffusion, DALL·E), music, synthetic data for simulation.
- Strengths: Produces novel, high‑fidelity outputs; can be fine‑tuned for many domains.
- Risks / limits: Hallucinations, lack of factual grounding, bias, copyright / IP concerns.
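Autoregressive generation in miniature: a bigram model counts next‑token frequencies in a toy corpus, then samples one token at a time conditioned on the previous one. Real models replace the count table with a Transformer, but the sampling loop has the same shape. The corpus here is illustrative only:

```python
import random
from collections import Counter, defaultdict

# Toy bigram "language model": count next-token frequencies, then sample
# autoregressively - each new token is conditioned on the one before it.
corpus = "the cat sat on the mat the cat ran".split()
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def generate(start, n, rng):
    """Sample up to n tokens, one at a time, from the bigram counts."""
    tokens = [start]
    for _ in range(n):
        options = counts[tokens[-1]]
        if not options:          # token never seen with a successor
            break
        words, freqs = zip(*options.items())
        tokens.append(rng.choices(words, weights=freqs)[0])
    return tokens

sample = generate("the", 5, random.Random(0))
```

Even this toy shows the paradigm's signature behavior: output is fluent locally (every bigram was seen in training) but has no grounding beyond the statistics — the seed of what we call hallucination at scale.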
Agentic AI
Idea: An autonomous “agent” that decides, plans, and acts in an environment to achieve a goal – often built on top of large language models (LLMs) with tool‑use capabilities.
- Typical architecture:
- Goal formulation (LLM or planner).
- Planning / reasoning (ReAct chain‑of‑thought, explicit planner modules).
- Tool use (web browsing, code execution, database queries, robot actuators).
- Feedback loop (observe outcome, adjust plan).
- Research threads: Reinforcement‑learning agents (AlphaGo, OpenAI Five), LLM‑driven agents (AutoGPT, BabyAGI, LangChain agents), multi‑agent systems (cooperative/competitive societies).
- When to use: Scenarios requiring multi‑step, self‑service workflows – e.g., “plan a trip, book flights, generate an itinerary”, autonomous research assistants, robotic control.
- Challenges: Safety & alignment, unpredictable behavior, high compute cost, reliance on stable external tools, need for robust monitoring.
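The plan → act → observe loop above can be sketched with stub “tools” standing in for real APIs. The tool names and the fixed plan are hypothetical; a real agent would let an LLM choose each next step from its accumulated observations:

```python
# Toy agent loop: execute a plan step by step via "tools", collecting
# observations. Tool names and the fixed plan are hypothetical stand-ins
# for an LLM planner plus real APIs (web search, code execution, ...).
TOOLS = {
    "search": lambda query: f"results for {query!r}",
    "calculate": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

def run_agent(plan, max_steps=10):
    """Act on each (tool, argument) step and record what was observed."""
    observations = []
    for tool, arg in plan[:max_steps]:   # cap steps to avoid runaway loops
        observations.append(TOOLS[tool](arg))
    return observations

obs = run_agent([("search", "flight price"), ("calculate", "120+80")])
```

The `max_steps` cap and the explicit tool registry hint at the monitoring and safety machinery real deployments need: an agent that can call arbitrary tools in a loop is exactly where unpredictable behavior and unsafe tool usage arise.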
When to Use Which Paradigm?
| Dimension | Classical AI | Shallow ML / Traditional ML | Deep Learning | Generative AI | Agentic AI |
|---|---|---|---|---|---|
| Primary output | Decision, plan, logical conclusion | Predicted label/value | Predicted label/value (or latent vector) | New data (text, image, code, sound…) | Sequence of actions (including tool calls) |
| Data requirement | Very little (knowledge engineered) | Moderate (hundreds‑thousands examples) | Large (≥ 10⁴–10⁶ examples) | Large (same as DL) | Large for underlying model + optional environment data |
| Explainability | High (rules explicit) | Medium (feature importances, tree paths) | Low‑Medium (layerwise visualizations) | Low (latent space opaque) | Low (LLM reasoning is stochastic) |
| Compute cost (training) | Minimal | Low‑moderate | High (GPU/TPU clusters) | High (large LLMs or diffusion pipelines) | Very high (LLM + tool‑execution environment) |
| Best use‑cases | Formal reasoning, compliance, low‑data domains | Tabular business analytics, quick prototypes | Vision, speech, raw‑text understanding | Creative content, data augmentation, design | Autonomous assistants, automated workflows, robotic control |
| Typical failure mode | Inflexibility to unseen situations | Bad feature engineering, over‑fitting | Bias, hallucination, distribution shift | Nonsense generation, unsafe output | Goal‑drift, unsafe tool usage, unpredictable loops |