Software And Hardware Co-Optimization For Deep Learning Algorithms On FPGA