반업주부의 일상 배움사

[Summary] GestureDiffuCLIP: Gesture Diffusion Model with CLIP Latents :: with AI

Banjubu 2023. 3. 29. 08:44

GestureDiffuCLIP: Gesture Diffusion Model with CLIP Latents

https://www.youtube.com/watch?v=Psi1IOZGq8c

[ Summary ]

The GestureDiffuCLIP model generates co-speech gestures from speech audio and text transcripts, using a semantics-aware mechanism and a CLIP-based encoder.
The system also accepts other input modalities, such as video clips or motion sequences, as style descriptors.
The model infuses style representations from text prompts into the diffusion model for flexible and creative results.
For more information, refer to the research paper.
The visualization results show successful generation of gestures based on text prompts such as "every night there was a fight" and "about a week before the certification exam I couldn't stand it anymore." Readers can learn more about the technology and its potential uses in fields such as animation and virtual reality.
The article is a compilation of quotes from different interviews with actor Tom Hanks.
He talks about various personal experiences such as meeting his wife, dealing with noisy neighbors, and the importance of lifelong friendships.
Hanks expresses a desire for life to go on forever and emphasizes the value of individual personalities and body types.
The reader is encouraged to reflect on their own relationships and what they appreciate about their friends.
The article discusses various topics such as combining noise using the diffusion model, volunteering at a Buddhist temple, using the iMovie app, and listening to music or watching documentaries.
It also mentions achieving time-varying style control by changing the input style prompt over time and controlling which input style is in effect.
The app is available on the App Store, on phones running Apple iOS.
A larger value of the parameter s produces a stronger style effect.
The reader can download the iMovie app and enjoy free time by listening to music or watching documentaries.
This article discusses using a framework to create diverse and realistic gestures for speech, and how it can be useful for enhancing co-speech gestures.
The framework can be automated using a large language model, such as ChatGPT from OpenAI.
The article provides examples of using the framework to generate co-speech gestures with different styles and for different purposes, such as telling a joke or describing food.
The article concludes by suggesting that readers can try out the framework for themselves and experiment with different styles and prompts to enhance their own co-speech gestures.
The speaker suggests using a speech tool for speech input, but synthesized voices are not included in training.
They mention the idea of traveling being about the journey rather than the destination, except when stuck in an airport for hours with a screaming toddler.
They talk about a friend with a hunchback who is not the best looking but is beloved.
The speaker spent a weekend with this friend and they went to a cafe where the friend shared that they work as an English teacher at a middle school.
The speaker enjoys listening to music, watching documentaries, or sleeping on free days.
No specific action is recommended for the reader.
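
The summary's remark that a larger s gives a stronger style effect can be illustrated with a small sketch. It assumes that s acts like a guidance scale that blends an unstyled prediction with a style-conditioned one, as in classifier-free guidance; the summary itself does not say how s is applied. The function name `guided_prediction` and the toy numbers are hypothetical, not taken from the paper or the video.

```python
from typing import List, Sequence


def guided_prediction(
    uncond: Sequence[float], cond: Sequence[float], s: float
) -> List[float]:
    """Blend two denoiser outputs with a guidance scale s.

    Assumption: s works like a classifier-free guidance scale.
    s = 0 ignores the style prompt, s = 1 returns the style-conditioned
    prediction unchanged, and s > 1 extrapolates past it.
    """
    if len(uncond) != len(cond):
        raise ValueError("predictions must have the same length")
    # Move from the unstyled prediction toward the styled one, scaled by s.
    return [u + s * (c - u) for u, c in zip(uncond, cond)]


# Toy stand-ins for one frame of predicted noise (hypothetical values).
eps_plain = [0.0, 1.0, -2.0]   # prediction without a style prompt
eps_styled = [1.0, 1.0, 0.0]   # prediction with a style prompt

for s in (0.0, 1.0, 2.5):
    print(s, guided_prediction(eps_plain, eps_styled, s))
```

Under this assumption, time-varying style control would amount to swapping the styled prediction (or the value of s) from one segment of the sequence to the next.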

License: Attribution, NonCommercial, NoDerivatives
