Posts tagged 'LLM'

List: LLM (5)

운동하는 공대생

Running an LLM on the Galaxy S25 (llama.cpp, Termux)

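The phone must be in developer mode, so enable it before starting. Prerequisites: 1. Install Termux. Termux is an app that provides a Linux terminal environment on Android, so you can run Linux commands, install packages, and do development work on a mobile device. https://play.google.com/store/apps/details?id=com.termux&hl=ko Running: when you first launch Termux, a screen like the one below appears. Termux provides a Linux environment on Unix-based Android devices. However, it has no administrator privileges such as sudo, and its behavior differs from device to device. (The latest high-spec ..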

Deep Learning 2025. 3. 12. 23:12

[Deep Learning] Quantization

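As LLMs have grown larger, research has naturally moved toward shrinking them, and quantization has recently become a standard technique. Before getting into quantization itself, let us first look at how data is represented. Integer: 3 → 11, 12 → 1100. 4-bit integer: 3 → 0011, 12 → 1100. Data is represented with 0s and 1s, so limiting the number of bits limits the range: a 4-bit (unsigned) int can represent values from 0 to 15. The numbers we actually use are not only integers; real numbers are used heavily as well. To represent them, computers use the float representation for the real numbers we know. A float can be represented in 32, 16, 8 bits, and so on..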

Deep Learning 2025. 1. 2. 13:16
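The bit-width example in the quantization excerpt above (3 and 12 as plain and 4-bit integers, and the 0 to 15 range) can be checked with a minimal Python sketch. The helper name `to_bits` is hypothetical and introduced only for illustration; it assumes unsigned integers, which the excerpt's 0 to 15 range implies.

```python
def to_bits(value: int, width: int) -> str:
    """Return the unsigned binary form of `value`, zero-padded to `width` bits.

    `to_bits` is a hypothetical helper for illustration, not from the post.
    """
    if not 0 <= value < 2 ** width:
        raise ValueError(f"{value} does not fit in {width} unsigned bits")
    return format(value, f"0{width}b")


WIDTH = 4  # bit limit used in the excerpt

# Without a fixed width, only the significant bits are shown.
print(format(3, "b"), format(12, "b"))

# With a fixed 4-bit width, smaller values are padded with leading zeros.
print(to_bits(3, WIDTH), to_bits(12, WIDTH))

# Smallest and largest values an unsigned 4-bit integer can hold.
print(0, 2 ** WIDTH - 1)
```

A value outside the range, such as `to_bits(16, 4)`, raises `ValueError`, which is the "limit on bits" the excerpt describes.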

[Paper] Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve

Paper: https://arxiv.org/abs/2403.02310 Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve. Each LLM serving request goes through two phases. The first is prefill which processes the entire input prompt and produces the first output token and the second is decode which generates the rest of output tokens, one-at-a-time. Prefill iterations have hi.. 1. Introduction: In this paper, the existing L..

Paper 2024. 8. 5. 18:37
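The two phases quoted in the Sarathi-Serve abstract above (prefill, then decode) can be pictured with a toy Python sketch. This is only an illustration of the request structure under stated assumptions: `generate` and `toy_next_token` are hypothetical names, the "model" is a stand-in arithmetic function, and nothing here reflects how Sarathi-Serve itself schedules requests or how a real engine uses a KV cache.

```python
from typing import Callable, List


def generate(
    prompt: List[int],
    next_token: Callable[[List[int]], int],
    max_new_tokens: int,
) -> List[int]:
    """Toy two-phase generation loop (hypothetical, for illustration only)."""
    if max_new_tokens <= 0:
        return []

    # Prefill: the entire prompt is processed in one step,
    # which produces the first output token.
    context = list(prompt)
    first = next_token(context)
    outputs = [first]
    context.append(first)

    # Decode: the remaining tokens are generated one at a time,
    # each step seeing everything produced so far.
    for _ in range(max_new_tokens - 1):
        token = next_token(context)
        outputs.append(token)
        context.append(token)
    return outputs


def toy_next_token(context: List[int]) -> int:
    """Stand-in for a model: a deterministic function of the context."""
    return (sum(context) + len(context)) % 100


print(generate([1, 2, 3], toy_next_token, max_new_tokens=3))
```

In this toy version both phases call the same function; in a real serving system the prefill step handles many prompt tokens in one pass while each decode step handles a single token, which is the asymmetry the abstract refers to.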

[Paper] LORA: LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS

Paper: https://arxiv.org/abs/2106.09685 LoRA: Low-Rank Adaptation of Large Language Models. An important paradigm of natural language processing consists of large-scale pre-training on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes le.. 1. Introduction: Across the many fields that make use of language models, specific ..

Paper 2024. 5. 17. 00:25

[Transformers] Auto GPT - Hands-on

Intro: With LLMs such as GPT, LLaMA, and Dolly booming these days, I tried out Auto-GPT as a hands-on exercise. Reference video: https://www.youtube.com/watch?v=YbLef4CrZNU&t=593s Source code: https://github.com/Significant-Gravitas/Auto-GPT GitHub - Significant-Gravitas/Auto-GPT: An experimental open-source attempt to make GPT-4 fully autonomous..

Deep Learning/Transformers 2023. 4. 24. 16:43
