AI compute

From GISAXS
 
* [https://abacus.ai/ Abacus AI] (open and closed models, Enterprise tier)

* [https://portkey.ai/ Portkey] (open? and closed models, Enterprise tier)

* [https://www.together.ai/ Together AI] (open models, Enterprise tier)

* [https://hyperbolic.xyz/ Hyperbolic AI] (open models, Enterprise tier)

* Hugging Face [https://huggingface.co/blog/inference-providers Inference Providers Hub]
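Many of the providers listed above expose OpenAI-compatible chat-completions endpoints, so a single client can switch between them by swapping the base URL, key, and model name. A minimal sketch of assembling such a request with the standard library — the base URL and model identifier here are illustrative examples, not verified endpoints; check each provider's documentation:

```python
import json

# Illustrative only: base URL and model name vary by provider.
BASE_URL = "https://api.together.xyz/v1"  # e.g. Together AI's OpenAI-compatible API
MODEL = "meta-llama/Llama-3-8b-chat-hf"   # example model identifier

def chat_completion_payload(model: str, user_message: str) -> str:
    """Build a standard OpenAI-style chat-completions request body."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    })

body = chat_completion_payload(MODEL, "Hello!")
endpoint = BASE_URL + "/chat/completions"
# POST `body` to `endpoint` with an Authorization: Bearer <key> header.
```

The value of the OpenAI-compatible convention is exactly this: only the constants change per provider, not the request shape.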
 
==Multi-model Web Chat Interfaces==

* [https://simtheory.ai/ SimTheory]

* [https://abacus.ai/ Abacus AI] [https://chatllm.abacus.ai/ ChatLLM]

* [https://poe.com/about Poe]

* [https://gab.ai/ Gab AI]

* [https://www.vectal.ai/login Vectal] ?

* [https://www.blackbox.ai/ BlackboxAI]
 
==Multi-model Web Playground Interfaces==

=Local Router=
 
* [https://ollama.com/ Ollama]

* [https://github.com/mudler/LocalAI LocalAI]

* [https://github.com/AK391/ai-gradio ai-gradio]: unified model interface (based on [https://www.gradio.app/ gradio])
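Ollama serves a local REST API (default port 11434); a minimal sketch of building a non-streaming <code>/api/generate</code> request with the standard library — the model name is only an example, and actually sending the request assumes a running local server with that model pulled:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Assemble (but do not send) a non-streaming generate request."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_generate_request("llama3", "Why is the sky blue?")
# To actually run it (requires `ollama serve` and `ollama pull llama3`):
#   with urllib.request.urlopen(req) as resp:
#       print(json.loads(resp.read())["response"])
```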
  
 
=Acceleration Hardware=
 
* Tesla [https://en.wikipedia.org/wiki/Tesla_Dojo Dojo]

* [https://deepsilicon.com/ Deep Silicon]: combined hardware/software solution for accelerated AI ([https://x.com/sdianahu/status/1833186687369023550 e.g.] ternary math)

=Energy Use=

* 2021-04: [https://arxiv.org/abs/2104.10350 Carbon Emissions and Large Neural Network Training]

* 2023-10: [https://arxiv.org/abs/2310.03003 From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference]

* 2024-01: [https://iea.blob.core.windows.net/assets/6b2fd954-2017-408e-bf08-952fdd62118a/Electricity2024-Analysisandforecastto2026.pdf Electricity 2024: Analysis and forecast to 2026]

* 2024-02: [https://www.nature.com/articles/s41598-024-54271-x The carbon emissions of writing and illustrating are lower for AI than for humans]

* 2025-04: [https://andymasley.substack.com/p/a-cheat-sheet-for-conversations-about Why using ChatGPT is not bad for the environment - a cheat sheet]

** A single LLM response uses only ~3 Wh ≈ 11 kJ (~10 Google searches; [https://docs.google.com/document/d/1pDdpPq3MyPdEAoTkho9YABZ0NBEhBH2v4EA98fm3pXQ/edit?usp=sharing examples of 3 Wh energy usage])

** Reading an LLM-generated response (a computer running for a few minutes) typically uses more energy than the LLM's generation of the text.
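A quick back-of-the-envelope check of the figures above (the per-search number is inferred here from the "~10 Google searches" comparison, not stated directly in the source):

```python
# Sanity-check the cited energy figures.
wh_per_llm_response = 3.0                      # ~3 Wh per LLM response
joules_per_wh = 3600                           # 1 Wh = 3600 J by definition
joules = wh_per_llm_response * joules_per_wh   # 10800 J
kilojoules = joules / 1000                     # ~10.8 kJ, i.e. roughly 11 kJ

# "~10 Google searches per response" implies roughly 0.3 Wh per search (inferred):
wh_per_search = wh_per_llm_response / 10
```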

Latest revision as of 14:35, 20 June 2025
