Hi, I'm syoka

journey.py
import asyncio

# Placeholder coroutines for the daily routine
async def keep_learning(): ...
async def take_action(): ...
async def deep_thinking(): ...

async def journey():
    while True:
        await keep_learning()
        await take_action()
        await deep_thinking()

asyncio.run(journey())

Tech Stack

Spring
Next.js
JetBrains
Docker
OpenAI

Latest Posts

Explore my latest thoughts and insights

Large Model Compression Techniques

Let's first set the scene for the era of large model compression. Starting with GPT-3, model parameter counts have climbed steadily, and with them the hardware requirements. Taking GPT-3 (175B parameters) as an example, the weights alone need roughly 350 GB of memory even at FP16 precision; with A100 80G cards, that means at least 5 GPUs just to hold the model. So for mobile and embedded devices with limited compute, keeping model performance acceptable means the model has to be compressed.
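
The back-of-envelope math behind those numbers is simple, and the sketch below (mine, not from the post) makes it concrete: estimate the weight footprint from the parameter count and bytes per parameter, then divide by per-GPU memory. The 175B parameter count and the INT4 line are illustrative assumptions used to show why compression matters.

memory_estimate.py
import math

def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    # Memory needed just to store the weights, in GB
    return num_params * bytes_per_param / 1e9

def gpus_needed(total_gb: float, gb_per_gpu: float = 80.0) -> int:
    # Minimum number of GPUs whose combined memory can hold the weights
    return math.ceil(total_gb / gb_per_gpu)

# GPT-3 scale: 175B parameters at FP16 (2 bytes per parameter)
fp16_gb = weight_memory_gb(175e9, 2)      # ~350 GB
print(fp16_gb, gpus_needed(fp16_gb))      # 350.0 5

# Hypothetical INT4 quantization (0.5 bytes per parameter): 4x smaller
int4_gb = weight_memory_gb(175e9, 0.5)
print(int4_gb, gpus_needed(int4_gb))      # 87.5 2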