In terms of implementation, I have created 4 different heuristic functions for my Capstone. Each takes in a state and outputs a value based on how good the state is, with 4 being inadmissible. Technically there are 5, as 2 of the heuristics are additions to a single heuristic.
Modified my admissible heuristic function so that it outputs a move (left = -1, right = +1, up = -4, down = +4), namely the move a search algorithm would have performed if it were taking an action in that state.
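To make the move encoding concrete, here is a minimal sketch of how the four offsets could be applied. It assumes a 4x4 board stored as a flat list of 16 values with 0 marking the blank, and that the offset is added to the blank's index; neither is stated above, and the names `SIZE`, `MOVES`, `legal_moves` and `apply_move` are hypothetical.

```python
# Sketch only: assumes a 4x4 board stored as a flat list, with 0 as the blank.
SIZE = 4  # assumed board width and height

# Move encoding from the notes: the value is an index offset in the flat list.
MOVES = {"left": -1, "right": +1, "up": -SIZE, "down": +SIZE}


def legal_moves(blank_index):
    """Return the offsets that keep the blank on the board."""
    row, col = divmod(blank_index, SIZE)
    moves = []
    if col > 0:
        moves.append(MOVES["left"])
    if col < SIZE - 1:
        moves.append(MOVES["right"])
    if row > 0:
        moves.append(MOVES["up"])
    if row < SIZE - 1:
        moves.append(MOVES["down"])
    return moves


def apply_move(state, offset):
    """Return a new state with the blank shifted by the given offset."""
    blank = state.index(0)
    if offset not in legal_moves(blank):
        raise ValueError(f"offset {offset} is not legal from index {blank}")
    target = blank + offset
    new_state = list(state)
    new_state[blank], new_state[target] = new_state[target], new_state[blank]
    return new_state
```

The bounds check matters because a bare offset of +1 from the last column would otherwise wrap the blank onto the next row.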
Created my training agent function, which outputs a file containing vectors of these heuristic outputs, one per state.
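A rough sketch of what such a training file could look like, assuming a CSV layout with one row per state and one column per heuristic. The file format, the function names, and the two placeholder heuristics are all assumptions made for illustration; they are not the heuristics used in the Capstone.

```python
import csv


def misplaced_tiles(state):
    """Placeholder heuristic: count tiles not at their own index (0 is the blank)."""
    return sum(1 for i, tile in enumerate(state) if tile != 0 and tile != i)


def blank_position(state):
    """Placeholder feature: index of the blank in the flat list."""
    return state.index(0)


def build_vectors(states, heuristics):
    """Return one vector per state, holding each heuristic's output in order."""
    return [[h(state) for h in heuristics] for state in states]


def write_vectors(vectors, path):
    """Write the vectors to a CSV file, one row per state."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(vectors)
```

For example, `write_vectors(build_vectors(states, [misplaced_tiles, blank_position]), "training.csv")` would produce one two-column row per state, where `training.csv` is a hypothetical file name.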
Did some research into activation functions and neural network types to figure out what initial design I should go with for my network.