=Cloud GPU=
* [https://lambdalabs.com/ Lambda]
* [https://vast.ai/ Vast AI]
* [https://hpc-ai.com/ HPC-AI]

=Cloud Training Compute=
* [https://nebius.ai/ Nebius AI]
* [https://glaive.ai/ Glaive AI]

=Cloud LLM Routers & Inference Providers=
* [https://openrouter.ai/ OpenRouter] (open and closed models, no Enterprise tier)
* [https://www.litellm.ai/ LiteLLM] (closed models, Enterprise tier)
* [https://centml.ai/ CentML] (open models, Enterprise tier)
* [https://fireworks.ai/ Fireworks AI] (open models, Enterprise tier)
* [https://abacus.ai/ Abacus AI] (open and closed models, Enterprise tier)
* [https://portkey.ai/ Portkey] (open? and closed models, Enterprise tier)
* [https://www.together.ai/ Together AI] (open models, Enterprise tier)
* [https://hyperbolic.xyz/ Hyperbolic AI] (open models, Enterprise tier)
* Hugging Face [https://huggingface.co/blog/inference-providers Inference Providers Hub]
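
Most of these routers expose an OpenAI-compatible REST API, so a single client can switch providers or models by changing only the base URL and the model string. A minimal sketch, assuming the <code>openai</code> Python package, an OpenRouter account with an <code>OPENROUTER_API_KEY</code> environment variable, and an example model name:

<syntaxhighlight lang="python">
# Minimal sketch: call a router through its OpenAI-compatible endpoint.
# Assumes `pip install openai` and an OPENROUTER_API_KEY environment variable.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",  # OpenRouter's OpenAI-compatible base URL
    api_key=os.environ["OPENROUTER_API_KEY"],
)

# The model string is only an example; any model offered by the router can be substituted.
response = client.chat.completions.create(
    model="meta-llama/llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "In one sentence, what does an LLM router do?"}],
)
print(response.choices[0].message.content)
</syntaxhighlight>
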
==Multi-model with Model Selection==
* [https://www.notdiamond.ai/ Not Diamond ¬⋄]
* [https://withmartian.com/ Martian]
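
These services pick a model per request, e.g. sending short or simple prompts to a cheap model and harder ones to a stronger one. The sketch below only illustrates that idea with a hand-written heuristic; it is not any vendor's API, and the model names and threshold are placeholders:

<syntaxhighlight lang="python">
# Illustrative sketch of per-prompt model selection (not a vendor API).
# Model names and the routing heuristic are placeholders.
def select_model(prompt: str) -> str:
    hard_markers = ("prove", "derive", "refactor", "step by step")
    if len(prompt) > 400 or any(m in prompt.lower() for m in hard_markers):
        return "large-reasoning-model"  # slower/costlier, higher quality
    return "small-fast-model"           # cheaper, lower latency

print(select_model("What is 2 + 2?"))                        # -> small-fast-model
print(select_model("Derive the gradient of the softmax."))   # -> large-reasoning-model
</syntaxhighlight>
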
==Multi-model Web Chat Interfaces==
* [https://simtheory.ai/ SimTheory]
* [https://abacus.ai/ Abacus AI] [https://chatllm.abacus.ai/ ChatLLM]
* [https://poe.com/about Poe]
* [https://gab.ai/ Gab AI]
* [https://www.vectal.ai/login Vectal] ?
* [https://www.blackbox.ai/ BlackboxAI]

==Multi-model Web Playground Interfaces==
* [https://www.together.ai/ Together AI]
* [https://hyperbolic.xyz/ Hyperbolic AI]

=Local Router=
* [https://ollama.com/ Ollama]
* [https://github.com/mudler/LocalAI LocalAI]
* [https://github.com/AK391/ai-gradio ai-gradio]: unified model interface (based on [https://www.gradio.app/ gradio])
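
Ollama serves pulled models over a local HTTP API (port 11434 by default). A minimal sketch, assuming the Python <code>requests</code> package and that a model has already been pulled (e.g. <code>ollama pull llama3.2</code>; the model name is just an example):

<syntaxhighlight lang="python">
# Minimal sketch: query a locally running Ollama server on its default port.
# Assumes a model has already been pulled, e.g. `ollama pull llama3.2`.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.2",                      # example model name
        "prompt": "In one sentence, what is a GPU?",
        "stream": False,                          # return a single JSON object
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
</syntaxhighlight>
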
=Acceleration Hardware=
* [https://www.nvidia.com/ Nvidia] GPUs
* Google [https://en.wikipedia.org/wiki/Tensor_Processing_Unit TPU]
* [https://www.etched.com/ Etched]: Transformer ASICs
* [https://cerebras.ai/ Cerebras]
* [https://www.untether.ai/ Untether AI]
* [https://www.graphcore.ai/ Graphcore]
* [https://sambanova.ai/ SambaNova Systems]
* [https://groq.com/ Groq]
* Tesla [https://en.wikipedia.org/wiki/Tesla_Dojo Dojo]
* [https://deepsilicon.com/ Deep Silicon]: combined hardware/software solution for accelerated AI ([https://x.com/sdianahu/status/1833186687369023550 e.g.] ternary math)

=Energy Use=
* 2021-04: [https://arxiv.org/abs/2104.10350 Carbon Emissions and Large Neural Network Training]
* 2023-10: [https://arxiv.org/abs/2310.03003 From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference]
* 2024-01: [https://iea.blob.core.windows.net/assets/6b2fd954-2017-408e-bf08-952fdd62118a/Electricity2024-Analysisandforecastto2026.pdf Electricity 2024: Analysis and forecast to 2026]
* 2024-02: [https://www.nature.com/articles/s41598-024-54271-x The carbon emissions of writing and illustrating are lower for AI than for humans]
* 2025-04: [https://andymasley.substack.com/p/a-cheat-sheet-for-conversations-about Why using ChatGPT is not bad for the environment - a cheat sheet]
** A single LLM response uses only ~3 Wh ≈ 11 kJ, roughly the energy of 10 Google searches ([https://docs.google.com/document/d/1pDdpPq3MyPdEAoTkho9YABZ0NBEhBH2v4EA98fm3pXQ/edit?usp=sharing examples of 3 Wh energy usage])
** Reading an LLM-generated response (a computer running for a few minutes) typically uses more energy than the LLM's generation of the text did.
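** Unit check for the 3 Wh figure above (taking ~0.3 Wh per Google search, an older and commonly cited estimate): 3 Wh × 3.6 kJ/Wh ≈ 10.8 kJ ≈ 11 kJ, and 3 Wh ÷ 0.3 Wh per search ≈ 10 searches.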
