Optimization of Large Models for Efficient Inference: Algorithm, Compiler, and System Co-Design