The kind of feedback deep reinforcement learning agents use
  Deep reinforcement learning agents deal with sequential feedback
  But, if it isn’t sequential, what is it?
  Deep reinforcement learning agents deal with evaluative feedback
  But, if it isn’t evaluative, what is it?
  Deep reinforcement learning agents deal with sampled feedback
  But, if it isn’t sampled, what is it?
Introduction to function approximation for reinforcement learning
  Reinforcement learning problems can have high-dimensional state and action spaces
  Reinforcement learning problems can have continuous state and action spaces
  There are advantages when using function approximation
NFQ: The first attempt at value-based deep reinforcement learning
  First decision point: Selecting a value function to approximate
  Second decision point: Selecting a neural network architecture
  Third decision point: Selecting what to optimize
  Fourth decision point: Selecting the targets for policy evaluation
  Fifth decision point: Selecting an exploration strategy
  Sixth decision point: Selecting a loss function
  Seventh decision point: Selecting an optimization method
  Things that could (and do) go wrong
Summary