Design and Implementation of an Interface to the Nintendo Entertainment System Controller

Abstract

This article covers the basic design and Verilog implementation of a programmable logic device that interfaces with a Nintendo Entertainment System (NES) controller. The design is implemented on the Altera UP2 Education Board, using the on-board seven-segment displays and VGA output. The design issues covered include timing the clock and latch pulses correctly to interface with the NES controller, designing a state machine to support functionality based on sequential button presses, and interfacing with the VGA output.


In designing this project, it is most important to consider both the original requirements and the protocols and limitations of the technology at hand. We begin by examining the functionality of the NES controller and the VGA controller, and then develop an overall design concept for implementing each of the tasks associated with these devices.

With the requirements addressed, the device protocols documented, and the design completed, it’s time to implement the Verilog modules.

module nintendoController(clk,ctrlClock,ctrlLatch,ctrlData,buttons,easterEgg,stateDisplay,color);

  output [0:7] stateDisplay;

  parameter Left=8'b11111101;
  parameter Right=8'b11111110;

  output [0:3] color;   // width must match the reg declaration below
  reg [0:3] color;

  parameter pollDelay = 555;

  input clk, ctrlData;
  output ctrlClock, ctrlLatch, easterEgg;
  output [0:7] buttons;
  reg ctrlLatch, ctrlClock;
  reg [0:7] buttons;
  reg [0:7] oldButtons;
  reg [0:10] counter;
  reg [0:7] persistance;
  reg [0:7] lag;

  initial counter = 0;
  initial persistance = 0;
  initial lag = 0;
  initial buttons = 8'b11111111;
  initial oldButtons = 8'b11111111;
  integer i;

  wire [0:3] temporaryState;
  // Module instances need instance names in standard Verilog.
  reclock reclock1(clk,clock);
  easterEggStateMachine stateMachine1(buttons,persistance == 10,(lag == 255 && ~easterEgg),easterEgg,temporaryState);
  BCD7Seg display1(stateDisplay,temporaryState);

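  // Generate the latch and clock pulses for the NES controller from the reclocked clock.
  always @(negedge clock) begin
    counter = (counter == (pollDelay + 18)) ? 0 : counter + 1;
    // Latch is high for two ticks at the end of the poll delay.
    ctrlLatch = (counter == pollDelay | counter == pollDelay + 1) ? 1 : 0;
    // Clock is high on every even count after the latch (eight pulses when pollDelay is odd).
    ctrlClock = (counter > pollDelay + 1 && counter % 2 == 0) ? 1 : 0;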
    if(counter == 0) begin
      if(buttons == oldButtons) begin
        persistance = (buttons != 255) ? persistance + 1 : 0;
        lag = (buttons == 255) ? lag + 1 : 0;
        if(persistance == 10 && easterEgg)
          color <= (buttons == Left) ? color - 1 : (buttons == Right) ? color + 1 : color;
      end else begin
        persistance = 0;
        oldButtons <= buttons;
      end
    end
  end

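  // Sample the data line on the opposite clock edge; bit i is read while
  // (counter - pollDelay)/2 == i.
  always @(posedge clock)
    for(i = 0; i < 8; i = i + 1)
      if((counter - pollDelay)/2 == i)
        buttons[i] = ctrlData;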

endmodule

The nintendoController module covers the entirety of the interface with the NES controller. To do this, it instantiates three additional modules: reclock, easterEggStateMachine, and BCD7Seg. The reclock module divides the onboard clock down to a lower frequency, giving it a period of approximately 6 microseconds. The easterEggStateMachine module manages the entire state machine, which provides the functionality for sequential button presses.

However, because the nintendoController module has direct access to the clock, it tracks the persistence of a button press to determine whether it counts as a full press or a misfire. (If a player were to press two buttons and then release them, one would likely be released marginally before the other. Rather than treat this as a separate button push, it’s best to record a press only once a certain level of consistency has been established.) In the same vein, lag is tracked. This value represents the amount of time since the last button release, and it is what causes the state machine to be reset should the user delay too long between button presses. Finally, the BCD7Seg module simply displays the hex value of the state on a seven-segment display when supplied with the 4-bit temporaryState value.
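As a rough illustration of the persistence and lag bookkeeping described above, the following Python sketch models the once-per-poll update in software. It is a behavioral approximation, not the hardware itself: the function name track_presses and its return format are assumptions made for this example, and the easterEgg condition that gates the reset in the Verilog is left out.

```python
NO_BUTTONS = 0xFF  # active-low: all ones means nothing is pressed


def track_presses(samples, threshold=10):
    """Model the once-per-poll persistence/lag update (hypothetical helper).

    samples: iterable of 8-bit button readings, one per poll.
    Returns (presses, timeouts): the readings held long enough to count
    as a press, and how many polls saw the idle counter at 255.
    """
    old_buttons = NO_BUTTONS
    persistence = 0
    lag = 0
    presses = []
    timeouts = 0
    for buttons in samples:
        if buttons == old_buttons:
            # Both counters are 8-bit registers in the Verilog, so they wrap.
            persistence = (persistence + 1) & 0xFF if buttons != NO_BUTTONS else 0
            lag = (lag + 1) & 0xFF if buttons == NO_BUTTONS else 0
            if persistence == threshold:
                presses.append(buttons)
            if lag == 255:
                timeouts += 1
        else:
            # The reading changed: restart the persistence count.
            persistence = 0
            old_buttons = buttons
    return presses, timeouts
```

For instance, feeding it three idle polls, twelve polls of 0x3F (A and B held), two polls of 0x7F (B released slightly before A), and then 300 idle polls yields a single press of 0x3F and one timeout; the brief 0x7F reading never reaches the threshold.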

The clock is set to a period of six microseconds so that the data line can be sampled precisely between the negative and positive edges of the clock. This ensures that the data sampled does correspond to the desired button. It also means that the latch and clock pulses sent to the NES controller must be generated from a clock with a shorter period, as seen in the always block triggered on the negative clock edge. Part of this logic requires that the otherwise arbitrary pollDelay parameter be an odd number: ctrlClock is driven high on even counter values, so only an odd pollDelay yields exactly eight clock pulses after the latch. In Figure 2, “nintendoController Waveform Test”, the parameter was set to a smaller value of five cycles between samples.
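The counter logic in the negative-edge always block can be checked with a small software model. The Python sketch below steps through one polling cycle and reports, for each counter value, the latch level, the clock level, and which button bit (if any) the positive-edge block would sample. The function name poll_cycle and the tuple layout are assumptions for this illustration; the arithmetic follows the Verilog expressions shown earlier.

```python
def poll_cycle(poll_delay=555):
    """Return one (counter, latch, clock, sample_index) tuple per reclocked tick.

    sample_index is the bit of `buttons` written on that tick, or None
    when no bit is sampled.
    """
    rows = []
    # The counter runs from 0 to poll_delay + 18 and then wraps to 0.
    for counter in range(poll_delay + 19):
        latch = int(counter in (poll_delay, poll_delay + 1))
        clock = int(counter > poll_delay + 1 and counter % 2 == 0)
        offset = counter - poll_delay
        # (counter - pollDelay) / 2 selects bits 0..7 for offsets 0..15.
        sample_index = offset // 2 if 0 <= offset < 16 else None
        rows.append((counter, latch, clock, sample_index))
    return rows


if __name__ == "__main__":
    for counter, latch, clock, index in poll_cycle(5):
        print(counter, latch, clock, index)
```

With poll_delay set to 5, as in the waveform test, the model gives a 24-tick cycle with the latch high on counts 5 and 6 and eight clock pulses on the even counts 8 through 22. Trying an even value such as 6 produces nine clock pulses instead, which is one way to see why the parameter has to be odd.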


///////////////////////////////////////////////////////////////////////////////
// DEMO FOR SIMPLE VGA CONTROLLER                                            //
// Author  : Aaron Egier                                                     //
// Date    : Nov 26, 2004                                                    //
// Version : 1.1                                                             //
//                                                                           //
// Based on the VHDL version by Deshanand Singh.                             //
//                                                                           //
// This program is a simple demo that demonstrates the functionality of the  //
// simple VGA controller. On power-up, it displays an image of a mailbox     //
// spanning the entire screen. Push the button FLEX_PB2 on the UP2 board to  //
// display the flashing lines.                                               //
//                                                                           //
// Comments/Suggestions/Problems/Improvements                                //
// -> email aegier@eecg.toronto.edu                                          //
///////////////////////////////////////////////////////////////////////////////
// MODIFIED BY:                                                              //
// Kevin Gisi                                                                //
// May, 2009                                                                 //
// It no longer displays a mailbox, but rather will write a single pixel     //
// when "go" is sent a positive signal, at the supplied row and column with  //
// the given color.                                                          //
///////////////////////////////////////////////////////////////////////////////

module vgaController(clock, resetn, go, hsync, vsync, data_r, data_g, data_b, row, column, color);

  input clock, resetn, go;
  output hsync, vsync, data_r, data_g, data_b;

  wire reset;
  wire enable_cnt;
  wire reset_cnt;
  wire done_screen;
  wire write_request;
  wire write_allowed;
  wire [4:0] timer;
  input [5:0] row, column;
  input [0:3] color;

  parameter sReset      = 2'b00;
  parameter sWritePixel = 2'b01;
  parameter sDone       = 2'b10;

  reg go_sync;
  reg [1:0] cstate, nstate;

  assign reset = ~resetn;

  // The timer uses the done_screen signal which occurs every 16.6ms
  // to cause the displayed line to flash every 1/2 second or so
  lpm_counter timer1( .clock(clock), .aclr(reset), .q(timer), .cnt_en(done_screen) );
  defparam timer1.lpm_width = 5;

  // Instantiate the VGA controller
  vgacon vga( clock, resetn, row, column, color + 1, write_request, write_allowed,
  			hsync, vsync, data_r, data_g, data_b, done_screen );

  // Make the push button synchronous with the main clock
  always @( posedge clock)
    go_sync <= go;

  // State machine register
  always @( posedge clock or negedge resetn )
    begin
      if( resetn == 1'b0 ) begin
        cstate <= sReset;
      end else begin
        cstate <= nstate;
      end
    end

  // Combinational part of the state machine. Figure out the
  // next state from the current state.
  always @( cstate or go_sync or write_allowed )
    begin
      case( cstate )
      sReset :
        if( go_sync == 1'b1 ) begin
          nstate <= sWritePixel;
        end else begin
          nstate <= sReset;
        end

      sWritePixel :
        if( write_allowed == 1'b1) begin
          nstate <= sDone;
        end else begin
          nstate <= sWritePixel;
        end
      sDone :
        nstate <= go_sync ? sWritePixel : sDone;
      endcase
    end

  // Outputs of the state machine

  // The counter increments on the next cycle only if the controller
  // allowed the request to write a SuperPixel
  assign enable_cnt  = ( cstate == sWritePixel && write_allowed == 1'b1 ) ? 1'b1 : 1'b0;

  // Request to write a SuperPixel whenever we are in the WritePixel state
  assign write_request = ( cstate == sWritePixel ) ? 1'b1 : 1'b0;

  // Reset xy_count when we are in the Done state
  assign reset_cnt   = ( cstate == sDone ) ? 1'b1 : 1'b0;

endmodule
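
The write sequencing in vgaController reduces to a three-state machine. As a hedged sketch, the Python model below mirrors the case statement of the combinational block so the transitions can be traced by hand; the helper names next_state and run are assumptions for this example, and the one-cycle go_sync register delay is not modeled.

```python
S_RESET, S_WRITE_PIXEL, S_DONE = "sReset", "sWritePixel", "sDone"


def next_state(state, go, write_allowed):
    """Next-state function mirroring the case statement in vgaController."""
    if state == S_RESET:
        return S_WRITE_PIXEL if go else S_RESET
    if state == S_WRITE_PIXEL:
        # Keep requesting the write until the VGA core allows it.
        return S_DONE if write_allowed else S_WRITE_PIXEL
    # sDone: wait here until go is asserted again.
    return S_WRITE_PIXEL if go else S_DONE


def run(inputs, state=S_RESET):
    """Apply (go, write_allowed) pairs, one per clock, and list the states visited."""
    trace = [state]
    for go, write_allowed in inputs:
        state = next_state(state, go, write_allowed)
        trace.append(state)
    return trace
```

Note that holding go high keeps the machine cycling between sWritePixel and sDone, so the same pixel is rewritten for as long as the signal stays asserted.
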
-------------------------------------------------------------------------------
-- SIMPLE VGA CONTROLLER WITH 8 COLOUR SUPPORT                               --
-- Author      : Deshanand Singh                                             --
-- Modified by : Aaron Egier                                                 --
-- Date        : Nov 26, 2004                                                --
-- Version     : 1.1                                                         --
--                                                                           --
-- This file is the VHDL source for a SIMPLE VGA CONTROLLER. The controller  --
-- currently contains enough memory to maintain a 64x64 resolution and eight --
-- colours for each pixel. The controller enables external devices to write  --
-- pixels into the memory, while creating the signals necessary for display  --
-- on a standard VGA/SVGA monitor. Since only a 64x64 resolution is supported--
-- by the memory while the monitor supports 640x480 pixels, each one of the  --
-- pixels in memory is mapped to a 10x8 block of real pixels called a        --
-- superpixel.                                                               --
--                                                                           --
-- Comments/Suggestions/Problems/Improvements                                --
-- -> email aegier@eecg.toronto.edu                                          --
-------------------------------------------------------------------------------
library ALTERA;
use ALTERA.maxplus2.all;

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
use IEEE.std_logic_unsigned.all;

library LPM;
use LPM.lpm_components.ALL;

entity vgacon is
    generic
        (
          ramfile       : string := "UNUSED"
        );
    port( clock         : in  std_logic; -- 25.175 MHz board clock
          resetn        : in  std_logic; -- Active low reset line

          -- The row, column, and colour for the pixel to be written
          row, column   : in  std_logic_vector( 5 downto 0 );
          colour        : in  std_logic_vector( 2 downto 0 );

          write_request : in  std_logic; -- Request to write the colour to (row,column) coordinates
          write_allowed : out std_logic; -- The write will complete on next positive edge of clock

          hsync, vsync  : out std_logic; -- Horizontal and vertical VGA sync lines
          data_r        : out std_logic; -- Red data line
          data_g        : out std_logic; -- Green data line
          data_b        : out std_logic; -- Blue data line

          done_screen   : out std_logic  -- Done drawing one screen
        );
end vgacon;

architecture ComponentLevel of vgacon is
   signal Reset                                 : std_logic;

   signal EnableVert_din, EnableVert            : std_logic;
   signal ResetVert_din, ResetVert              : std_logic;
   signal CounterHoriz, CounterVert             : std_logic_vector( 9 downto 0 );

   signal CounterTen                            : std_logic_vector( 3 downto 0 );
   signal EnableCol_din, EnableCol              : std_logic;

   signal CounterCol, CounterRow                : std_logic_vector( 5 downto 0 );

   signal RAMreadAddress , RAMwriteAddress_din,
          RAMwriteAddress, RAMaddress_din,
          RAMaddress                            : std_logic_vector(11 downto 0 );

   signal WriteStart                            : std_logic;
   signal WriteControl                          : std_logic_vector( 1 downto 0 );

   signal Reading_din, Reading                  : std_logic;

   signal HorizValid_pipe1, VertValid_pipe1,
          ViewValid_pipe2, ViewValid_pipe3      : std_logic;

   signal PixelDataIn, PixelDataIn_intermediate,
          PixelDataOut_pipe3                    : std_logic_vector( 2 downto 0 );

   signal DataR_din, DataG_din, DataB_din,
          hsync_din, vsync_din                  : std_logic;

   signal High                                  : std_logic;
   signal Low2                                  : std_logic_vector( 1 downto 0 );
   signal Low4                                  : std_logic_vector( 3 downto 0 );

   --
   -- Constants that define the timing for the VGA sync signals.
   -- Don't change these unless you are absolutely sure of how the
   -- VGA timing works!. The monitor could be damaged when the
   -- timing is incorrect.
   ---
   constant C_VERT_NUM_PIXELS  : integer := 480;
   constant C_VERT_SYNC_START  : integer := 493;
   constant C_VERT_SYNC_END    : integer := 494;
   constant C_VERT_TOTAL_COUNT : integer := 525;

   constant C_HORZ_NUM_PIXELS  : integer := 640;
   constant C_HORZ_SYNC_START  : integer := 659;
   constant C_HORZ_SYNC_END    : integer := 755;
   constant C_HORZ_TOTAL_COUNT : integer := 800;

begin

   -- Various Constants

   High  <= '1';
   Low4  <= "0000";
   Low2  <= "00";

   Reset <= NOT Resetn;

   -- Counter Enables and Resets

   EnableVert_din <= '1' when CounterHoriz = (C_HORZ_TOTAL_COUNT-2) else '0';
   dff1: dff PORT MAP ( d => EnableVert_din, q => EnableVert, clk => Clock,
                        clrn => Resetn, prn => High );

   ResetVert_din  <= '1' when ( EnableVert_din = '1' and
                                CounterVert= (C_VERT_TOTAL_COUNT-1) ) else '0';

   dff2: DFF PORT MAP ( d => ResetVert_din , q => ResetVert , clk => Clock,
                        clrn => Resetn, prn => High );

   -- Horizontal and Vertical counters, which allow us to send out the pixel
   -- data and sync signals at the correct time.

   Horizontal: lpm_counter
               GENERIC MAP ( lpm_width   => 10 )
               PORT    MAP ( clock       => Clock,
                             aclr        => Reset,
                             sclr        => EnableVert,     -- Reset Horiz on next line
                             q           => CounterHoriz );

   Vertical:   lpm_counter
               GENERIC MAP ( lpm_width   => 10 )
               PORT    MAP ( clock       => Clock,
                             aclr        => Reset,
                             sclr        => ResetVert,
                             q           => CounterVert,
                             cnt_en      => EnableVert );

   -- Every SuperPixel that is shown on the screen is made up of 10 horizontal pixels
   -- so we must keep track of the column number in addition to the pixel number.
   -- The following increments the column counter for every 10 pixels encountered.
   Count10:    lpm_counter
               GENERIC MAP ( lpm_width   =>  4 )
               PORT    MAP ( clock       => Clock,
                             aclr        => Reset,
                             sclr        => EnableCol,      -- Move to new column, so reset
                             q           => CounterTen );   -- ten pixel counter.

   EnableCol_din <= '1' when CounterTen="1000" else '0';
   dff3: DFF PORT MAP ( d => EnableCol_din, q => EnableCol, clk => Clock,
                        clrn => Resetn, prn => High );

   CountCol:   lpm_counter
               GENERIC MAP ( lpm_width   => 6 )
               PORT    MAP ( clock       => Clock,
                             aclr        => Reset,
                             sclr        => EnableVert,     -- Next line, reset column counter
                             q           => CounterCol,
                             cnt_en      => EnableCol );          

   -- Every SuperPixel shown is made up of 8 vertical pixels. No counter circuitry is required
   -- since dividing by 8 is equivalent to shifting right by 3 bits. Thus only the following
   -- line is used to keep track of the row number.

   CounterRow <= CounterVert(8 downto 3);

   -- Are we in the Signal Range where we should send out pixels ?
   -- The following calculates whether we should actually send out pixels
   -- or if the r,g,b lines should be set to '0'. Three cycles of latency
   -- are added to the output ViewValid_pipe3. The FFs are placed between
   -- the combinational units rather than in a SR configuration so that the
   -- path delays are reduced as much as possible.
   --
   process (Clock, Reset)
   begin
      if Reset = '1' then

         HorizValid_pipe1 <= '0';
         VertValid_pipe1  <= '0';
         ViewValid_pipe2  <= '0';
         ViewValid_pipe3  <= '0';

      elsif Clock'Event and Clock='1' then

         if CounterHoriz  >= 0 and CounterHoriz < C_HORZ_NUM_PIXELS then
            HorizValid_pipe1 <= '1';
         else
            HorizValid_pipe1 <= '0';
         end if;

         if CounterVert   >= 0 and CounterVert  < C_VERT_NUM_PIXELS then
            VertValid_pipe1  <= '1';
         else
            VertValid_pipe1  <= '0';
         end if;

         ViewValid_pipe2 <= HorizValid_pipe1 and VertValid_pipe1;
         ViewValid_pipe3 <= ViewValid_pipe2;
      end if;
   end process;           

   -- When are we Reading ?

   Reading_din <= '1' when ( CounterHoriz <= (C_HORZ_NUM_PIXELS +20) or
                             CounterHoriz >= (C_HORZ_TOTAL_COUNT-20) ) else '0';

   dff5: dff PORT MAP ( d => Reading_din, q => Reading, clk => Clock,
                        prn => Resetn, clrn => High );

   -- RAM read address. The address in the RAM which contains the current SuperPixel
   -- being displayed.

   RAMreadAddress( 11 downto 6 ) <= CounterRow; -- upper six bits contain the row
   RAMreadAddress(  5 downto 0 ) <= CounterCol; -- lower six bits contain the column

   -- RAM write address. The address where the user wishes to place a SuperPixel.

   RAMwriteAddress_din(11 downto 6 ) <= row;
   RAMwriteAddress_din( 5 downto 0 ) <= column;

   reg1: lpm_ff
         GENERIC MAP ( lpm_width => 12 )
         PORT    MAP ( data      => RAMwriteAddress_din,
                       q         => RAMwriteAddress,
                       clock     => Clock,
                       aclr      => Reset );

   -- The WRITE Memory controller.
   -- Write to memory when we get a request and we are not reading the
   -- memory to display pixels.

   WriteStart <= '1' when ( write_request='1' and Reading='0' ) else '0';

   -- Currently a write request is processed in one cycle when we are in
   -- writing mode. Thus as soon as the request is made the write_allowed
   -- ack is sent back.

   write_allowed <= NOT Reading; 

   -- Register the WriteStart Signal with 2 cycles of latency.

   srg1: lpm_shiftreg
         GENERIC MAP ( lpm_width => 2 )
         PORT    MAP ( clock     => Clock,
                       data      => Low2,
                       aclr      => Reset,
                       shiftin   => WriteStart,
                       q         => WriteControl );               

   -- Register the colour input with 2 cycles of latency. This is necessary since the
   -- (row,column) address experiences two cycles of latency before getting to the
   -- synchronous RAM.
   --
   -- Similar to the original version except that we shift three bits at a time now.
   srg2: process (Clock, Reset)
   begin
      if Reset='1' then

         PixelDataIn_intermediate <= "000";
         PixelDataIn              <= "000";

      elsif Clock'Event and Clock = '1' then

         PixelDataIn_intermediate <= colour;
         PixelDataIn              <= PixelDataIn_intermediate;

      end if;
   end process;

   -- Multiplexer that selects between the reading and writing address. The signal
   -- is also registered before it is sent to the RAM.

   RAMaddress_din <= RAMreadAddress WHEN WriteControl(0)='0' ELSE RAMwriteAddress;

   reg2: lpm_ff
         GENERIC MAP ( lpm_width => 12 )
         PORT    MAP ( data      => RAMaddress_din,
                       q         => RAMaddress,
                       clock     => Clock,
                       aclr      => Reset );

   -- The VIDEO RAM :-) . Two of the EABs in the FLEX10K20 will be used to implement
   -- the 4096 bits of ram used for creating a two colour display with 64x64 resolution.
   --
   -- I've now separated the RAM into 3 distinct banks for each colour. One could just
   -- change the width to 3, but I want to keep the memory file format consistent with
   -- the last version.
   --
   VidRam00: lpm_ram_dq
             GENERIC MAP ( lpm_widthad         => 12,              -- 64x64 SuperPixel grid
                           lpm_outdata         => "REGISTERED",    -- register the output
                           lpm_indata          => "REGISTERED",    --   and input as well
                           lpm_address_control => "REGISTERED",    --   as add/cont lines.
                           lpm_file            => ramfile,
                           lpm_width           => 1 )
             PORT    MAP ( data     => PixelDataIn(0 downto 0),
                           address  => RAMaddress,
                           we       => WriteControl(1),
                           q        => PixelDataOut_pipe3(0 downto 0),
                           inclock  => Clock,
                           outclock => Clock );

   VidRam01: lpm_ram_dq
             GENERIC MAP ( lpm_widthad         => 12,              -- 64x64 SuperPixel grid
                           lpm_outdata         => "REGISTERED",    -- register the output
                           lpm_indata          => "REGISTERED",    --   and input as well
                           lpm_address_control => "REGISTERED",    --   as add/cont lines.
                           lpm_file            => ramfile,
                           lpm_width           => 1 )
             PORT    MAP ( data     => PixelDataIn(1 downto 1),
                           address  => RAMaddress,
                           we       => WriteControl(1),
                           q        => PixelDataOut_pipe3(1 downto 1),
                           inclock  => Clock,
                           outclock => Clock );

   VidRam10: lpm_ram_dq
             GENERIC MAP ( lpm_widthad         => 12,              -- 64x64 SuperPixel grid
                           lpm_outdata         => "REGISTERED",    -- register the output
                           lpm_indata          => "REGISTERED",    --   and input as well
                           lpm_address_control => "REGISTERED",    --   as add/cont lines.
                           lpm_file            => ramfile,
                           lpm_width           => 1 )
             PORT    MAP ( data     => PixelDataIn(2 downto 2),
                           address  => RAMaddress,
                           we       => WriteControl(1),
                           q        => PixelDataOut_pipe3(2 downto 2),
                           inclock  => Clock,
                           outclock => Clock );

   -- Send out the Data. The data is ANDed with the valid signal. This ensures
   -- that data is only sent out when displaying a pixel. The r,g,b lines should
   -- be '0' at all other times.

   DataR_din <= ViewValid_pipe3 AND PixelDataOut_pipe3( 2 );
   dff8: dff port map ( d => DataR_din, q => data_r, clk => Clock,
                        clrn => Resetn, prn => High ); -- data_r -> 4 cycles of latency

   DataG_din <= ViewValid_pipe3 AND PixelDataOut_pipe3( 1 );
   dff9: dff port map ( d => DataG_din, q => data_g, clk => Clock,
                        clrn => Resetn, prn => High ); -- data_g -> 4 cycles of latency

   DataB_din <= ViewValid_pipe3 AND PixelDataOut_pipe3( 0 );
   dffA: dff port map ( d => DataB_din, q => data_b, clk => Clock,
                        clrn => Resetn, prn => High ); -- data_b -> 4 cycles of latency

   -- Send out the Sync Signals
   -- Note: A four bit shift register is placed between these signals
   --       and the output. It is needed to match the latency of the r,g,b
   --       signals. The shift reg FFs can probably be used for retiming,
   --       but for simplicity, it is just left as a 4 bit SR.
   --
   hsync_din <= '0' when ( CounterHoriz >= C_HORZ_SYNC_START and
                           CounterHoriz <= C_HORZ_SYNC_END ) else '1';
   srg3: lpm_shiftreg
         GENERIC MAP ( lpm_width => 4 )
         PORT    MAP ( clock     => Clock,
                       data      => Low4,
                       aset      => Reset,
                       shiftin   => hsync_din,
                       shiftout  => hsync );

   vsync_din <= '0' when ( CounterVert  >= C_VERT_SYNC_START and
                           CounterVert  <= C_VERT_SYNC_END ) else '1';

   sr4: lpm_shiftreg
         GENERIC MAP ( lpm_width => 4 )
         PORT    MAP ( clock     => Clock,
                       data      => Low4,
                       aset      => Reset,
                       shiftin   => vsync_din,
                       shiftout  => vsync );

   -- The done screen signal

   done_screen <= ResetVert;

end ComponentLevel;
-------------------------------------------------------------------------------
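
The superpixel mapping described in the header of the VHDL listing can be summarized in a few lines. The Python sketch below computes which 64x64 memory cell a given 640x480 screen pixel is read from, following the RAMreadAddress assignment (row in the upper six bits, column in the lower six). The function name superpixel_address is an assumption for this illustration, and the pipeline latency of the real design is ignored.

```python
def superpixel_address(x, y):
    """Map a 640x480 screen pixel to (row, column, ram_address)."""
    if not (0 <= x < 640 and 0 <= y < 480):
        raise ValueError("pixel outside the visible 640x480 area")
    row = y >> 3      # 8 scan lines per superpixel: CounterVert(8 downto 3)
    column = x // 10  # 10 pixels per superpixel: Count10 feeding CountCol
    address = (row << 6) | column  # row in bits 11..6, column in bits 5..0
    return row, column, address
```

Because 480 / 8 = 60, only rows 0 through 59 of the 64 rows in memory ever appear on screen; a pixel written to rows 60 through 63 is stored but not displayed.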
