SwePub
onr:"swepub:oai:DiVA.org:hj-63647"
 

Neuro-symbolic Visual Graph Question Answering with LLMs for language parsing

Bauer, Jakob Johannes (author)
Vienna University of Technology, Vienna, Austria
Eiter, Thomas (author)
Vienna University of Technology, Vienna, Austria
Ruiz, Nelson Higuera (author)
Vienna University of Technology, Vienna, Austria
Oetsch, Johannes (author)
Vienna University of Technology, Vienna, Austria
2023
English.
  • Conference paper (peer-reviewed)
Abstract

Images containing graph-based structures are a ubiquitous and popular form of data representation that, to the best of our knowledge, have not yet been considered in the domain of Visual Question Answering (VQA). We provide a respective novel dataset and present a modular neuro-symbolic approach as a first baseline. Our dataset extends CLEGR, an existing dataset for question answering on graphs inspired by metro networks. Notably, the graphs there are given in symbolic form, while we consider the more challenging problem of taking images of graphs as input. Our solution combines optical graph recognition for graph parsing, a pre-trained optical character recognition neural network for parsing node labels, and answer-set programming for reasoning. The model achieves an overall average accuracy of 73% on the dataset. While regular expressions are sufficient to parse the natural language questions, we also study various large-language models to obtain a more robust solution that also generalises well to variants of questions that are not part of the dataset. Our evaluation provides further evidence of the potential of modular neuro-symbolic systems, in particular with pre-trained models, to solve complex VQA tasks.

Subject headings

NATURVETENSKAP  -- Data- och informationsvetenskap -- Datavetenskap (hsv//swe)
NATURAL SCIENCES  -- Computer and Information Sciences -- Computer Sciences (hsv//eng)

Keywords

neuro-symbolic computation
answer-set programming
visual question answering
large-language models

Publication and Content Type

ref (subject category)
kon (subject category)
