TY - GEN
T1 - A fast structured regression for large networks
AU - Zhou, Fang
AU - Ghalwash, Mohamed
AU - Obradovic, Zoran
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2016
Y1 - 2016
N2 - Structured regression has been successfully used in many applications where explanatory and response variables are inter-correlated, such as in weighted attributed networks. One such structured model, Gaussian Conditional Random Fields (GCRF), which utilizes multiple unstructured models to learn non-linear relationships between node attributes and the structured response variable, achieves high prediction accuracy. However, it does not scale well to large networks. We propose a novel model, called Scalable Approximate GCRF (SA-GCRF), which integrates weighted attributed network compression with GCRF, with the aim of making GCRF applicable to large networks. The model consists of three steps: first, it compresses a network into a smaller one by generalizing nodes into supernodes and edges into superedges; then, it applies GCRF to the reduced network; and finally, it unfolds the predicted response variables onto the original nodes. Our hypothesis is that the reduced network retains most of the information in the original network, so that the loss in prediction accuracy incurred by applying GCRF to the reduced network is minor. Comprehensive experimental results indicate that SA-GCRF was 150-520 times faster than standard GCRF and 11-29 times faster than the state-of-the-art UmGCRF on large networks, and provided regression results where GCRF and UmGCRF were not applicable. Furthermore, SA-GCRF achieved a regression accuracy of 0.76, similar to that obtained on the original real-world weighted attributed citation network, even after compressing the network to 10% of its size.
AB - Structured regression has been successfully used in many applications where explanatory and response variables are inter-correlated, such as in weighted attributed networks. One such structured model, Gaussian Conditional Random Fields (GCRF), which utilizes multiple unstructured models to learn non-linear relationships between node attributes and the structured response variable, achieves high prediction accuracy. However, it does not scale well to large networks. We propose a novel model, called Scalable Approximate GCRF (SA-GCRF), which integrates weighted attributed network compression with GCRF, with the aim of making GCRF applicable to large networks. The model consists of three steps: first, it compresses a network into a smaller one by generalizing nodes into supernodes and edges into superedges; then, it applies GCRF to the reduced network; and finally, it unfolds the predicted response variables onto the original nodes. Our hypothesis is that the reduced network retains most of the information in the original network, so that the loss in prediction accuracy incurred by applying GCRF to the reduced network is minor. Comprehensive experimental results indicate that SA-GCRF was 150-520 times faster than standard GCRF and 11-29 times faster than the state-of-the-art UmGCRF on large networks, and provided regression results where GCRF and UmGCRF were not applicable. Furthermore, SA-GCRF achieved a regression accuracy of 0.76, similar to that obtained on the original real-world weighted attributed citation network, even after compressing the network to 10% of its size.
UR - https://www.scopus.com/pages/publications/85015160159
U2 - 10.1109/BigData.2016.7840594
DO - 10.1109/BigData.2016.7840594
M3 - Conference contribution
AN - SCOPUS:85015160159
T3 - Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016
SP - 106
EP - 115
BT - Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016
A2 - Joshi, James
A2 - Karypis, George
A2 - Liu, Ling
A2 - Hu, Xiaohua Tony
A2 - Ak, Ronay
A2 - Xia, Yinglong
A2 - Xu, Weijia
A2 - Sato, Aki-Hiro
A2 - Rachuri, Sudarsan
A2 - Ungar, Lyle
A2 - Yu, Philip S.
A2 - Govindaraju, Rama
A2 - Suzumura, Toyotaro
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th IEEE International Conference on Big Data, Big Data 2016
Y2 - 5 December 2016 through 8 December 2016
ER -