1
votes

I'm working for my master thesis and I'm pretty new to VHDL, but still I have to implement some complex things. This is one of the easiest structures I had to write, and still I'm encountering some problems.

It's a FSM implementing a 24bit shift register with an active-low sync signal (to program a DAC). It's just the end of a complex elaboration chain I created for my project. I followed the example model of a FSM as much as I could.

The behavioral simulation works fine, actually the whole elaboration chain I created works perfectly fine as far as the behavioral simulation concerns. However, once I try the Post-translate simulation things start to go wrong: lots of 'X' output signals.

With this simple shift register I DON'T get any 'X', however I can't get to the load_and_prepare_data phase. It seems that the current_state changes (by inspecting some signals), but the elaboration doesn't go on.

Please keep in mind that since I'm new to the language, I have no idea of what timing constraints I should set on this FSM (and I wouldn't know how to write them on the top.ucf anyway)

Can you see what's wrong? Thanks in advance

EDIT

I followed your advices and cleaned up the FSM by using a single state process. I still have some doubts about "where to put what" but I really like the new implementation. Anyway I now get a clean behavioral simulation but 'X' on all outputs in post translate simulation. What is causing this? I'll post the both the new code and the testbench:

----------------------------------------------------------------------------------
-- Company: 
-- Engineer: 
-- 
-- Create Date:    14:44:03 11/28/2014 
-- Design Name: 
-- Module Name:    dac_ad5764r_24bit_sr_programmer_v2 - Behavioral 
-- Project Name: 
-- Target Devices: 
-- Tool versions: 
-- Description: This is a PISO shift register that gets a 24bit parallel input word.
--              It outputs the 24bit input word starting from the MSB and enables
--              an active low ChipSelect line for 24 clock periods.
-- Dependencies: 
--
-- Revision: 
-- Revision 0.01 - File Created
-- Additional Comments: 
--
----------------------------------------------------------------------------------
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
use IEEE.NUMERIC_STD.ALL;

-- Uncomment the following library declaration if instantiating
-- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity dac_ad5764r_24bit_sr_programmer_v2 is
    Port ( clk : in  STD_LOGIC;
           start : in  STD_LOGIC;
           reset : in  STD_LOGIC; -- Note that this reset is for the FSM not for the DAC
           reset_all_dac : in STD_LOGIC;
           data_in : in  STD_LOGIC_VECTOR (23 downto 0);
           serial_data_out : out  STD_LOGIC;
           sync_out : out  STD_LOGIC; -- This is a chip select
           reset_out : out STD_LOGIC;
           busy : out STD_LOGIC
         );
end dac_ad5764r_24bit_sr_programmer_v2;

architecture Behavioral of dac_ad5764r_24bit_sr_programmer_v2 is

-- Stati
type state_type is (idle, load_and_prepare_data, transmission);
--ATTRIBUTE ENUM_ENCODING : STRING; 
--ATTRIBUTE ENUM_ENCODING OF state_type: TYPE IS "001 010 100";
signal state: state_type := idle;
--signal next_state: state_type := idle;

-- Clock counter
--signal clk_counter_enable : STD_LOGIC := '0';
signal clk_counter : unsigned(4 downto 0) := (others => '0');

-- Shift register
signal stored_data: STD_LOGIC_VECTOR (23 downto 0) := (others => '0');

begin

FSM_single_process: process(clk)
begin
    if rising_edge(clk) then
        if reset = '1' then
            serial_data_out <= '0';
            sync_out <= '1';
            reset_out <= '1';
            busy <= '0';
            state <= idle;
        else
            -- Default
            serial_data_out <= '0';
            sync_out <= '1';
            reset_out <= '1';
            busy <= '0';

            case (state) is
                when transmission =>
                    serial_data_out <= stored_data(23);
                    sync_out <= '0';
                    busy <= '1';
                    clk_counter <= clk_counter + 1;
                    stored_data <= stored_data(22 downto 0) & "0";
                    state <= transmission;
                    if (clk_counter = 23) then
                        state <= idle;
                    end if;
                when others => -- Idle
                    if start = '1' then
                        serial_data_out <= data_in(23);
                        sync_out <= '0';
                        reset_out <= '1';
                        busy <= '1';
                        stored_data <= data_in;
                        clk_counter <= "00001";
                        state <= transmission;
                    end if;
            end case;

--            if (reset_all_dac = '1') then
--              reset_out <= '0';
--          end if;
        end if;
    end if;
end process;


end;

And the testbench:

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
--USE ieee.numeric_std.ALL;

ENTITY dac_ad5764r_24bit_sr_programmer_tb IS
END dac_ad5764r_24bit_sr_programmer_tb;

ARCHITECTURE behavior OF dac_ad5764r_24bit_sr_programmer_tb IS 

    -- Component Declaration for the Unit Under Test (UUT)

    COMPONENT dac_ad5764r_24bit_sr_programmer_v2
    PORT(
         clk : IN  std_logic;
         start : IN  std_logic;
         reset : IN  std_logic;
         data_in : IN  std_logic_vector(23 downto 0);
         serial_data_out : OUT  std_logic;
         reset_all_dac : IN std_logic;
         sync_out : OUT  std_logic;
         reset_out : OUT  std_logic;
         --finish : OUT  std_logic;
         busy : OUT  std_logic
        );
    END COMPONENT;


   --Inputs
   signal clk : std_logic := '0';
   signal start : std_logic := '0';
   signal reset : std_logic := '0';
   signal data_in : std_logic_vector(23 downto 0) := (others => '0');
   signal reset_all_dac : std_logic := '0';

    --Outputs
   signal serial_data_out : std_logic;
   signal sync_out : std_logic;
   signal reset_out : std_logic;
   --signal finish : std_logic;
   signal busy : std_logic;

   -- Clock period definitions
   constant clk_period : time := 100 ns;

BEGIN

    -- Instantiate the Unit Under Test (UUT)
   uut: dac_ad5764r_24bit_sr_programmer_v2 PORT MAP (
          clk => clk,
          start => start,
          reset => reset,
          data_in => data_in,
          reset_all_dac => reset_all_dac,
          serial_data_out => serial_data_out,
          sync_out => sync_out,
          reset_out => reset_out,
          --finish => finish,
          busy => busy
        );

   -- Clock process definitions
   clk_process :process
   begin
        clk <= '0';
        wait for clk_period/2;
        clk <= '1';
        wait for clk_period/2;
   end process;


   -- Stimulus process
   stim_proc: process
   begin        
      -- hold reset state for 100 ns.
      wait for clk_period*10;
      reset <= '1' after 25 ns;
      wait for clk_period*1;
      reset <= '0' after 25 ns;
      wait for clk_period*3; 
      reset_all_dac <= '1' after 25 ns;
      wait for clk_period*1;
      reset_all_dac <= '0' after 25 ns;
      wait for clk_period*5; 
      data_in <= "111111111111111111111111" after 25 ns;
      wait for clk_period*3;
        start <= '1' after 25 ns;
      wait for clk_period*1;
        start <= '0' after 25 ns;


      wait;
   end process;

END;

UPDATE 1

Updated with the last design: this code is not causing any 'X' (can't figure out why, this doesn't but the previous did). However it's not starting (in POST-TRANSLATE simulation) just like the first 3 process machine, and the signal sync_out is stuck at 0 while it should be '1' by default.

simulation 1

UPDATE 2

I've been looking into the tecnology schematic, starting from the problem of the sync_out=0: it's implemented with a FDS, S is the FSM reset signal, D is coming from a LUT3 with I = state&reset&start and INIT = 45 = "00101101". I've looked for this LUT3 in the simulation and I've noticed that it has INIT = "00000000"!

Is there something I'm missing about how to run this simulation? It seems that every LUT in the design have not been set!

UPDATE 3 It seems that the Post-Translate simulation is buggy in some way, or I'm not configuring it correctly for some reason: the Post-Map and the Post-PAR simulations work and display some outputs. However there is an odd bug: the stored_data register is not updated with the complete data_in vector, after that, the FSM operates correctly and outputs the data stored. I've looked in the tecnology schematic just after synthesis and for some reason the bits 23,22,21,19,18 are not connected to the corresponding data_in bit. You can see the effect in this screenshot from Post-Map simulation. Same happens in Post-PAR, but it seems that this problems comes directly from the synthesis! Stored_Data =/= Data_in

Solved: the strange output comes from the Synthesis optimization. The tool realized that the previous block in the elaboration chain will never output a bit different from 0 for those specific bit. My mistake was assuming that I could test the single block alone: what I was really testing was the block synthetized for the FPGA taking into account everything else in the design!

Thanks to everybody helped me, I'm going to follow your advices!

2
try putting the FSM next state into a sync process (i.e clocked), makes analysis easier.Sam Palmer
another suggestion, as I can't see anything glaringly obvious which is wrong in the VHDL. Have you made sure, that you are correctly driving the signals from the test bench or top level module, i.e start is not held on 1.Sam Palmer
also try clocking FSM_output_reg_process.Sam Palmer
Are you annotating the post-synthesis netlist with the SDF delay values? That can cause unexpected results when the delays on the clock tree aren't accounted for leading to invalid cycle slips and other simulation errors.Kevin Thibedeau
Thanks for the suggestions! @SamPalmer the testbench should be fine and it works in behavioral. Why should I clock those processes? AFAIK those processes have to be async so they can elaborate as soon as they have the inputs, then their outputs are used in clocked processes.FlyerDragon

2 Answers

2
votes

I prefer the single-process form of state machine, which is cleaner, simpler and much less prone to bugs like sensitivity-list errors. I would also endorse the points in Paebbels' excellent answer. However I don't think any of these are the problem here.

One thing to be aware of in post-synth and post-PAR simulations is that their model of time is different from the behavioural model. The behavioural model follows simple rules as I described in this answer and ensures that in a typical design flow you can go straight to hardware - without post-synth simulation, without worry.

Indeed I only use post-synth or post-PAR simulations if I'm chasing a suspected tool bug. (For FPGA designs, not ASIC, that is!)

However, that simple timing model has its limitations. You may be familiar with problems like a clock signal assigned via signal assignment (usually buried in a 3rd party model where you don't expect it) which consumes a delta cycle, and ensures that your clocked data arrives before your clock instead of after, and everything subsequently occurs one cycle earlier than intended...

In behavioural modelling, a little discipline will keep clear of such troubles. But the same is not true of post-PAR modelling.

Your testbench is probably set up the same way as the behavioural model. And if so, that is likely to be the problem.

Here's what I do in this situation : I claim no formal authority for it, just experience. It also works well when interfacing the FPGA to external memory models with realistic timings.

1) I assume the simple (behavioural) timing model works correctly for all signals INTERNAL to the design.

2) I assume nothing of the sort for inputs and outputs from the design.

3) I take note of the estimated setup and hold timings on the inputs, (a) from the FPGA datasheet or better, (b) from the worst case values shown in the post-synth or post-PAR report, and structure the testbench around them. Worked example : setup time 1 ns, hold time 2 ns, clock period 10 ns. This means that any input between 2 ns and 9 ns after a clock edge is guaranteed to be corrrectly read. I choose (arbitrarily) 5 ns.

signal_to_fpga <= driving_value after 5 ns;

(Note that Xilinx makes this absurdly counter-intuitive by expressing them as "offset in/out before/after" which refers timings to a previous or future clock edge instead of the one you're looking at)

Alternatively, if the input is fed from a CPU or memory in the real world, I use datasheet timing specifications for that device.

4) I take note of the worst case clk-out timing reported in the datasheet or report, and structure the design around them. (say, 7 ns)

fpga_output_pin <= driving_value after 7 ns;

Note that this "after" clause is obviously ignored by synthesis; however the post-synth back-annotation will introduce something very like it.
5) If this turns out to be not good enough, then (possibly in a wrapper component to avoid polluting the synthesisable code) improve accuracy like

fpga_output_pin <= 'X' after 1 ps, driving_value after 7 ns;

6) I re-run the behavioural simulation. Typically, it now fails, because it was written without realistic timings in mind.

7) I fix those failures. This may include adding realistic delays before testing values output from the design. It can be an iterative process.

Now, I have a reasonable expectation that the post-PAR simulation model will drop straight in to the testbench and work.

2
votes

Here are some hints to improve your code:

  1. You can remove the Xilinx dependencies to UNISIM, because you are not using any Xilinx Primitves.
  2. Applying attribute ENUM_ENCODING has no effect on state encoding unless you also define the attribute FSM_ENCODING and set it's value to user. One-Hot encoding can be forced by setting FSM_ENCODING to one-hot. Normally synthesis is smart enough to find the best encoding.
    read more ...
  3. None of your registers has a default value:
    signal current_state : state_type := idle;
  4. Your FSM is no FSM in the eyes of Xilinx synthesis tool (XST). I'm sure if you look into your synthesis report, you won't find that XST reports a FSM for current_state.
    So what's wrong with your FSM?
    • Your FSM has no initial state.
    • Your FSM has multiple reset states (idle, load_and_prepare_data)
    • Your FSM has no transition from idle to load_and_prepare_data (reset is no transition)
    • Writing next_state transitions for the current state can cause XST to think it's no FSM
      the default assignment next_state <= current_state; is sufficient.
  5. If you change the type of signal clk_counter to unsigned you can do arithmetic much easier.
    increment: clk_counter <= clk_counter + 1;
    clear: clk_counter <= (others => '0');
    compare: if (clk_counter = 23) then
  6. It's no good style to use the FSM's state signal outside of the FSM processes.

    FSM_next_state_process: process(current_state, start,  clk_counter, reset_all_dac)
    begin
      next_state  <= current_state;
    
      OutReg_busy        <= '1';
      OutReg_reset_out   <= '1';
      OutReg_sync_out    <= '1';
      clk_counter_enable <= '0';
    
      case (current_state) is
        when idle =>
          OutReg_busy      <= '0';
          if (reset_all_dac = '1') then
            OutReg_reset_out <= '0';
          end if;
    
        when load_and_prepare_data =>
          next_state <= transmission;
    
        when transmission =>
          clk_counter_enable <= '1';
          OutReg_sync_out <= '0';
    
          if (clk_counter = 23) then
            next_state <= idle;
          end if;
    
        when others =>
          next_state <= idle;
    
      end case ;
    end process;