A data-driven ensemble framework for modeling high-dimensional data: theory, methods, algorithms, and applications