Basic paper information
- Document type: Academic journal
- Publication date: June 2026
- Pages: 846-852 (7 pages)
- DOI: 10.5302/J.ICROS.2026.26.0013
Abstract and keywords
The safety of autonomous driving systems depends on their ability to perceive the surrounding environment accurately, and instance segmentation, which identifies individual objects, is a core enabling technology. Recent studies using vision language models (VLMs), such as language-guided image segmentation (LISA), which segments target objects based on natural language instructions, have demonstrated the potential to overcome the limitations of conventional methods that recognize only fixed classes. However, there remains room for improvement in how effectively these models exploit the semantic understanding capabilities of VLMs for segmentation tasks. Motivated by this observation, this study proposes a method that enhances VLM training by incorporating an auxiliary caption loss into the LISA architecture, with the aim of improving instance segmentation performance in complex scenarios such as autonomous driving. The proposed approach encourages the model to learn from both segmentation instructions and caption information that captures the image's global context, enabling the VLM to establish deeper associations between visual features and linguistic semantics. The effectiveness of the proposed method is validated experimentally.
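The abstract does not give the loss formulation, so the following is only a minimal sketch of how an auxiliary caption loss might be added to a segmentation training objective. Everything named here is an assumption for illustration: the function names, the `caption_weight` value, and the choice of a text cross-entropy term plus a BCE-and-Dice mask term as the base objective. It is not the authors' implementation.

```python
import math

def token_cross_entropy(logits, targets):
    """Mean cross-entropy over a token sequence.

    logits: list of per-token score lists; targets: list of target token ids.
    """
    total = 0.0
    for scores, target in zip(logits, targets):
        m = max(scores)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[target]
    return total / len(targets)

def bce_loss(probs, mask, eps=1e-7):
    """Mean binary cross-entropy between predicted pixel probabilities and a 0/1 mask."""
    total = 0.0
    for p, y in zip(probs, mask):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(mask)

def dice_loss(probs, mask, eps=1e-7):
    """Dice loss: 1 - 2|P∩G| / (|P| + |G|), computed on soft predictions."""
    inter = sum(p * y for p, y in zip(probs, mask))
    denom = sum(probs) + sum(mask)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)

def total_loss(seg_logits, seg_targets, probs, mask,
               cap_logits, cap_targets, caption_weight=0.5):
    """Base segmentation objective plus a weighted auxiliary caption term.

    caption_weight is a hypothetical hyperparameter; the paper's value is not
    stated in the abstract.
    """
    l_txt = token_cross_entropy(seg_logits, seg_targets)     # instruction/answer text
    l_mask = bce_loss(probs, mask) + dice_loss(probs, mask)  # mask quality
    l_cap = token_cross_entropy(cap_logits, cap_targets)     # auxiliary caption text
    return l_txt + l_mask + caption_weight * l_cap
```

With `caption_weight=0.0` the objective reduces to the base segmentation loss, so under these assumptions the caption term acts purely as an additional training signal and does not change how masks are predicted at inference time.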
Table of contents
- Abstract
- I. Introduction
- II. Related Work
- III. Proposed Method
- IV. Experimental Results
- V. Conclusion
- REFERENCES