AWS도 어려워 하는 손 그림 자동 태깅

« 2025/04 »
일	월	화	수	목	금	토
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30

Tags more

Archives

Today

Total

« 2025/04 »
일	월	화	수	목	금	토
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30

Tags more

Archives

Today

Total

관리 메뉴

반업주부의 일상 배움사

AWS도 어려워 하는 손 그림 자동 태깅 :: BLIP2 by Salesforce 본문

IT 인터넷/GCP

AWS도 어려워 하는 손 그림 자동 태깅 :: BLIP2 by Salesforce

Banjubu 2023. 2. 15. 11:11

지난번엔 Text to Image를 해봤는데요.

2023.02.13 - [IT 인터넷/GCP] - Openjourney로 마음껏 그림 생성하기 :: 허깅페이스(HuggingFace) 모델 사용 예제

Openjourney로 마음껏 그림 생성하기 :: 허깅페이스(HuggingFace) 모델 사용 예제

오픈저니(Openjourney)는 미드저니(Midjourney)의 오픈소스 버전이에요. GCP에서 윈도우 서버를 생성했어요. (이게 거의 최저 사양, VSCode 띄우면 메모리 부족함) 2 vCPU + 7.5 GB memory US$88.97 2 NVIDIA T4 US$540.20

banjubu.tistory.com

이번엔 BLIP2 모델을 이용해서 이미지를 분석한 후 자동으로 태그를 생성해주는걸 해볼게요. (Image to Text)

BLIP2가 좋은건 사진은 말할 것도 없고, 그림도 꽤 잘 인식한다는거에요. AWS의 Rekognition을 사용하면 Drawing, Paint 같은 것만 나오거든요.

https://huggingface.co/spaces/Salesforce/BLIP2

BLIP2 - a Hugging Face Space by Salesforce

huggingface.co

BLIP2 모델을 돌리기 위해 GCP에 인스턴스를 만들어요. (이 정도가 낮은 사양)

64 vCPU + 416 GB memory	US$3,543.31
4 NVIDIA T4	US$1,080.40
10GB 분산된 영구 디스크	US$1.30
사용 할인	-US$1,387.11
Total	US$3,237.90

모델을 이용하려면 LAVIS가 필요해요.

$ pip install salesforce-lavis

설치중에 pycocotools 관련 에러를 만나면 (GIT 필요):

$ pip3 install "git+https://github.com/philferriere/cocoapi.git#egg=pycocotools&subdirectory=PythonAPI"

실행 코드:

import torch
from PIL import Image
from IPython.display import display
from lavis.models import load_model_and_preprocess

device = torch.device("cuda") if torch.cuda.is_available() else "cpu"
raw_image = Image.open("./sample.jpg").convert("RGB")
model, vis_processors, _ = load_model_and_preprocess(name="blip2_t5", model_type="pretrain_flant5xxl", is_eval=True, device=device)
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
result = model.generate({"image": image, "prompt": "Question: Could you give me tags of the image? Answer:"})
print(result)

결과:

['elf, elfin, fantasy art']

['girl, watermelon, eat']

LIST

저작자표시 비영리 변경금지

'IT 인터넷 > GCP' 카테고리의 다른 글

Openjourney로 마음껏 그림 생성하기 :: 허깅페이스(HuggingFace) 모델 사용 예제 (0)	2023.02.13
[GCP] 오토스케일링과 로드밸런싱 :: Auto Scaling & Load Balancing (0)	2022.08.17
[GCP] 고정 IP 사용 :: Compute Engine (0)	2022.08.10
[GCP] 도메인 구매와 DNS 설정하기 :: Domain and DNS (0)	2022.08.04
[GCP] Compute Engine 생성하기 :: 서버 한 대 추가요 (0)	2022.08.03