
JACIII Vol.28 No.2 pp. 239-254 (2024)
doi: 10.20965/jaciii.2024.p0239

Research Paper:

A Comparative Study of Relation Classification Approaches for Japanese Discourse Relation Analysis

Keigo Takahashi*, Teruaki Oka*, Mamoru Komachi**, and Yasufumi Takama*

*Graduate School of Systems Design, Tokyo Metropolitan University
6-6 Asahigaoka, Hino, Tokyo 191-0065, Japan

**Graduate School of Social Data Science, Hitotsubashi University
2-1 Naka, Kunitachi, Tokyo 186-8601, Japan

Received: August 23, 2023
Accepted: October 2, 2023
Published: March 20, 2024
Keywords: natural language processing, discourse relation analysis, special token, Japanese
Abstract

This paper presents a comparative analysis of classification approaches for the Japanese discourse relation analysis (DRA) task. In Japanese DRA, implicit relations, in which no explicit discourse phrase appears, are difficult to resolve. To better understand implicit relations, we compared four approaches that incorporate a special token to encode the relation between the given discourses. The four approaches insert a special token at the beginning of the sentence, at the end of the sentence, at the conjunctive position, or at a random position, and classify the relation between the two discourses into one of the following categories: CAUSE/REASON, CONCESSION, CONDITION, PURPOSE, GROUND, CONTRAST, and NONE. Our experimental results revealed that special tokens encode the relations of the given discourses more effectively than pooling-based approaches. In particular, randomly inserting a special token outperforms the other approaches, including pooling-based ones, on the most frequent CAUSE/REASON category among implicit relations and on categories with few instances. Moreover, we classified the errors in the relation analysis into three categories, namely confounded phrases, ambiguous relations, and errors requiring world knowledge, to guide further improvements.
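
To make the special-token idea concrete, the following is a minimal sketch (not the authors' released code) of how a relation token could be inserted into a concatenated discourse pair and its final hidden state used for classification instead of a pooled representation, using the Hugging Face Transformers library. The encoder name, the [REL] token name, and the helper functions are illustrative assumptions; the paper targets Japanese, so a Japanese pretrained encoder would be used in practice, and the linear head would be fine-tuned on the DRA data.

import random
import torch
from transformers import AutoTokenizer, AutoModel

LABELS = ["CAUSE/REASON", "CONCESSION", "CONDITION", "PURPOSE",
          "GROUND", "CONTRAST", "NONE"]

# Assumed encoder for illustration only.
MODEL_NAME = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.add_special_tokens({"additional_special_tokens": ["[REL]"]})
encoder = AutoModel.from_pretrained(MODEL_NAME)
encoder.resize_token_embeddings(len(tokenizer))
classifier = torch.nn.Linear(encoder.config.hidden_size, len(LABELS))  # trained jointly in practice

def insert_rel(tokens, position):
    # Insert [REL] at the beginning, end, or a random position of the token list.
    # (A conjunctive-position variant would place it where a connective could occur.)
    tokens = list(tokens)
    if position == "begin":
        idx = 0
    elif position == "end":
        idx = len(tokens)
    else:  # "random"
        idx = random.randint(0, len(tokens))
    tokens.insert(idx, "[REL]")
    return tokens

def classify(arg1, arg2, position="random"):
    # Concatenate the two discourse arguments and insert the special token.
    tokens = tokenizer.tokenize(arg1) + [tokenizer.sep_token] + tokenizer.tokenize(arg2)
    tokens = insert_rel(tokens, position)
    ids = tokenizer.build_inputs_with_special_tokens(tokenizer.convert_tokens_to_ids(tokens))
    input_ids = torch.tensor([ids])

    # Read the hidden state of [REL] rather than a pooled sentence representation.
    hidden = encoder(input_ids).last_hidden_state            # (1, seq_len, hidden_size)
    rel_id = tokenizer.convert_tokens_to_ids("[REL]")
    rel_pos = (input_ids[0] == rel_id).nonzero(as_tuple=True)[0]
    logits = classifier(hidden[0, rel_pos])                  # (1, num_labels)
    return LABELS[int(logits.argmax(dim=-1))]

With an untrained head this returns an arbitrary label; the point of the sketch is only the wiring: the classifier reads the representation accumulated at the inserted [REL] position rather than a [CLS] or mean-pooled vector.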

Special token shows performance and robustness
Cite this article as:
K. Takahashi, T. Oka, M. Komachi, and Y. Takama, “A Comparative Study of Relation Classification Approaches for Japanese Discourse Relation Analysis,” J. Adv. Comput. Intell. Intell. Inform., Vol.28 No.2, pp. 239-254, 2024.
