An explanation of how parameters are used during LLM inference, and of the key differences between total and active parameters in dense versus Mixture of Experts (MoE) architectures.
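
As a rough illustration of the total-versus-active distinction, here is a minimal sketch that counts feed-forward weights for a single layer. It assumes a simplified two-matrix feed-forward block and ignores attention, embeddings, biases, and router weights; the function name `ffn_param_counts` and all sizes are hypothetical and not taken from any particular model.

```python
def ffn_param_counts(d_model, d_ff, num_experts=1, experts_per_token=1):
    """Return (total, active) feed-forward weight counts for one layer.

    Simplifying assumptions (hypothetical, for illustration only):
    each expert is an up-projection plus a down-projection, with no
    biases, and router weights are not counted.
    """
    per_expert = 2 * d_model * d_ff            # up-projection + down-projection
    total = num_experts * per_expert           # weights that must be stored
    active = experts_per_token * per_expert    # weights used for a single token
    return total, active


# Dense layer: one feed-forward block, used in full for every token.
dense_total, dense_active = ffn_param_counts(d_model=1024, d_ff=4096)

# MoE layer: 8 experts stored, but only 2 are routed to per token.
moe_total, moe_active = ffn_param_counts(
    d_model=1024, d_ff=4096, num_experts=8, experts_per_token=2
)

print(f"dense: total={dense_total:,} active={dense_active:,}")
print(f"moe:   total={moe_total:,} active={moe_active:,}")
```

Under these assumptions the dense layer's total and active counts are equal, while the MoE layer stores all eight experts but uses only two of them for any given token.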