Paper:
Algorithm for Web Service Discovery Based on Information Retrieval Using WordNet and Linear Discriminant Functions
Ricardo Sotolongo, Carlos Kobashikawa, Fangyan Dong, and Kaoru Hirota
Department of Computational Intelligence and Systems Science, Tokyo Institute of Technology, G3-49, 4259 Nagatsuta, Midori-ku, Yokohama 226-8502, Japan
An algorithm based on information retrieval is proposed that combines the lexical database WordNet with a linear discriminant function. It calculates the degree of similarity between words and their relative importance to support the development of distributed applications based on web services. The algorithm uses the semantic information contained in Web Service Description Language (WSDL) specifications and ranks web services by their similarity to the one the developer is searching for. It is applied to a set of 48 real web services in five categories and compared with four other algorithms based on information retrieval, showing an average improvement over all data of between 0.6% and 1.9% in precision and between 0.7% and 3.1% in recall for the top 15 ranked web services. The objective is to reduce the burden and time spent searching for web services during the development of distributed applications, and the algorithm can be used as an alternative to current web service discovery systems such as brokers in the Universal Description, Discovery, and Integration (UDDI) platform.
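The evaluation measure named in the abstract, precision and recall over the top-ranked web services, can be illustrated with a minimal sketch. The function name, the service identifiers, and the relevance set below are hypothetical and are not taken from the paper; only the cutoff of 15 corresponds to the evaluation described above, and the paper's own evaluation procedure may differ in detail.

```python
def precision_recall_at_k(ranked, relevant, k=15):
    """Return (precision, recall) over the top-k entries of a ranked list.

    ranked   -- service identifiers ordered from most to least similar
    relevant -- set of identifiers judged relevant to the query
    k        -- cutoff; 15 mirrors the top-15 evaluation in the abstract
    """
    top_k = ranked[:k]
    hits = sum(1 for service in top_k if service in relevant)
    # Precision: fraction of retrieved services that are relevant.
    precision = hits / len(top_k) if top_k else 0.0
    # Recall: fraction of relevant services that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ranking returned for a currency-conversion query.
ranked = ["currency_a", "weather_a", "currency_b", "zip_a", "currency_c"]
relevant = {"currency_a", "currency_b", "currency_c", "currency_d"}

precision, recall = precision_recall_at_k(ranked, relevant, k=3)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

With a cutoff of 3, two of the three retrieved services are relevant and two of the four relevant services are retrieved. Averaging such values over all queries gives figures comparable in kind to those reported in the abstract.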
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.