STM32F4: generating parallel signals with the FSMC
The goal: use the memory controller to generate a "generic" 16-bit parallel data stream with a clock. Address generation is disregarded, as are the other control signals dedicated to memory chips.
It must be noted that the STM32F40x and STM32F41x have the FSMC (static memories), while the STM32F42x and STM32F43x have the FMC (static and dynamic memories). The differences between the two concern the support of SDRAM (dynamic RAM), the write FIFOs (both data and address for the FMC, instead of data only for the FSMC, and 16 words long instead of only 2 words long for the FSMC), and the 32-bit wide data bus of the FMC (see [1]).
Set pins (1st attempt)
Only the data bus FSMC_D[15:0] and the clock FSMC_CLK are used (set as alternate function). The other pins are set as standard GPIOs (general-purpose output).
FSMC is alternate function 12, according to the datasheet (see "Table 9. Alternate function mapping" in [2]).
/* PD: 0, 1, 3, 8, 9, 10, 14, 15 alternate function (0b10) */
GPIOD->MODER = 0xA56A559A;
GPIOD->AFR[0] = 0xCCCCCCCC; /* FSMC = AF12 (0xC) */
GPIOD->AFR[1] = 0xCCCCCCCC;
/* PE: 7, 8, 9, 10, 11, 12, 13, 14, 15 alternate function (0b10) */
GPIOE->MODER = 0xAAAA9555;
GPIOE->AFR[0] = 0xCCCCCCCC;
GPIOE->AFR[1] = 0xCCCCCCCC;
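The MODER constants above pack two bits per pin: 0b10 for alternate function, 0b01 for general-purpose output. As a cross-check, a minimal host-side sketch can rebuild them from a pin mask. The helper name moder_for_af_pins is hypothetical (it is not part of any ST header), and the code assumes that every pin outside the mask is meant to be a plain output, as in this article.

```c
#include <stdint.h>

/* Hypothetical helper (not from the ST headers): build a GPIOx_MODER value
 * where every pin set in af_mask is "alternate function" (0b10) and every
 * other pin is "general-purpose output" (0b01). */
uint32_t moder_for_af_pins(uint16_t af_mask)
{
    uint32_t moder = 0;
    for (unsigned pin = 0; pin < 16; pin++) {
        uint32_t mode = ((af_mask >> pin) & 1u) ? 2u : 1u;
        moder |= mode << (2u * pin);   /* two mode bits per pin */
    }
    return moder;
}
```

With this sketch, the mask for PD0, 1, 3, 8, 9, 10, 14, 15 (0xC70B) gives 0xA56A559A, and the mask for PE7 to PE15 (0xFF80) gives 0xAAAA9555, which matches the values written to the registers.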
FSMC Setup/init (1st attempt)
Be careful with the wicked register map documentation of the FSMC block.
It is very misleading: all the other register tables are ordered as found in memory, but not this one.
/* PSRAM, synchronous (burst), non-multiplexed */
/* Control register */
FSMC_Bank1->BTCR[0] = FSMC_BCR1_CBURSTRW | FSMC_BCR1_WAITPOL | FSMC_BCR1_BURSTEN | FSMC_BCR1_MWID_0 | FSMC_BCR1_WREN | FSMC_BCR1_MTYP_0 /* PSRAM */ | FSMC_BCR1_MBKEN;
/* Timing register */
FSMC_Bank1->BTCR[1] = FSMC_BTR1_CLKDIV_1 /* Div 3 */;
Note that all the timings are set to 0, except the clock divider.
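For reference, the timing register can also be composed from numeric field values instead of single-bit macros. The sketch below assumes the FSMC_BTRx layout given in the STM32F4 reference manual (ADDSET in bits 3:0, ADDHLD in bits 7:4, DATAST in bits 15:8, CLKDIV in bits 23:20, DATLAT in bits 27:24). The function name fsmc_btr is hypothetical, and the other fields of the register are left at 0.

```c
#include <stdint.h>

/* Hypothetical helper: pack the main FSMC_BTRx timing fields.
 * Assumed layout: ADDSET[3:0], ADDHLD[7:4], DATAST[15:8],
 * CLKDIV[23:20], DATLAT[27:24]. Other fields stay at 0. */
uint32_t fsmc_btr(uint32_t addset, uint32_t addhld, uint32_t datast,
                  uint32_t clkdiv, uint32_t datlat)
{
    return ((addset & 0xFu)  << 0)  |
           ((addhld & 0xFu)  << 4)  |
           ((datast & 0xFFu) << 8)  |
           ((clkdiv & 0xFu)  << 20) |
           ((datlat & 0xFu)  << 24);
}
```

For the setup above, fsmc_btr(0, 0, 0, 2, 0) gives 0x00200000: only CLKDIV = 2 (divide by 3) is set, which is the same bit as FSMC_BTR1_CLKDIV_1.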
Result (1st attempt)
The code writing to the FSMC walks through an array and simulates sequential memory requests, in order to take advantage of the burst mode.
volatile uint16_t* fsmc = (volatile uint16_t*)0x60000000;
for (uint32_t i = 0; i < (sizeof(bitstream_bin) / 2); i++) {
    uint16_t w = ((uint16_t*)bitstream_bin)[i];
    fsmc[i] = w; /* consecutive addresses, to look like a sequential access */
}
The clock is ~54 MHz, and the maximum clock is HCLK/2 = 168/2 = 84 MHz. Unfortunately, my oscilloscope is too slow for this.
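The clock figures follow from the CLKDIV field, where the FSMC_CLK period is (CLKDIV + 1) HCLK periods and CLKDIV = 0 is reserved. A small sketch of that arithmetic, assuming HCLK = 168 MHz as in this article (the function name fsmc_clk_hz is hypothetical):

```c
#include <stdint.h>

/* Hypothetical helper: FSMC_CLK frequency for a given HCLK and CLKDIV field.
 * The clock period is (CLKDIV + 1) HCLK periods; CLKDIV = 0 is reserved,
 * so the fastest setting is CLKDIV = 1 (HCLK / 2). */
uint32_t fsmc_clk_hz(uint32_t hclk_hz, uint32_t clkdiv)
{
    return hclk_hz / (clkdiv + 1u);
}
```

With CLKDIV = 2 (divide by 3) this gives a nominal 56 MHz, which is presumably what the ~54 MHz reading corresponds to, and CLKDIV = 1 gives the 84 MHz maximum.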
At least 4 clock cycles are required to write one data word, because the lowest data latency (DATLAT) is 2 cycles: there is one cycle to give the address, 2 cycles of latency, and one cycle to give the data.
At the maximum FSMC speed (~84 MHz), after dividing the clock by 4, the 16-bit parallel transmission would only run at ~20 MHz.
Bursts are possible up to 32 bits (two 16-bit data words). When using this feature, two data words are sent for each address, hence more data is sent, but the clock is hard to use: 3 ticks for the (empty) address, 1 tick for the first data word, 1 tick for the second data word (5 cycles for 2 data words, ~30 MHz max).
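The two throughput estimates above are the same calculation: the effective word rate is the bus clock multiplied by the words sent per transaction and divided by the clock cycles per transaction. A minimal sketch of that arithmetic (the function name word_rate_hz is hypothetical):

```c
#include <stdint.h>

/* Hypothetical helper: effective data word rate when each transaction
 * transfers `words` data words and lasts `cycles` bus clock cycles. */
uint32_t word_rate_hz(uint32_t clk_hz, uint32_t words, uint32_t cycles)
{
    /* 64-bit intermediate so clk_hz * words cannot overflow */
    return (uint32_t)(((uint64_t)clk_hz * words) / cycles);
}
```

At 84 MHz, one word per 4 cycles gives 21 MHz, and two words per 5 cycles gives 33.6 MHz, in line with the ~20 MHz and ~30 MHz figures quoted above.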
Set pins (2nd attempt)
/* PD: 0, 1, 8, 9, 10, 14, 15 alternate function (0b10) */
GPIOD->MODER = 0xA56A555A;
GPIOD->AFR[0] = 0xCCCCCCCC; /* FSMC = AF12 (0xC) */
GPIOD->AFR[1] = 0xCCCCCCCC;
/* PE: 7, 8, 9, 10, 11, 12, 13, 14, 15 alternate function (0b10) */
GPIOE->MODER = 0xAAAA9555;
GPIOE->AFR[0] = 0xCCCCCCCC;
GPIOE->AFR[1] = 0xCCCCCCCC;
/* PB: 7 alternate function (0b10), used for NADV */
GPIOB->MODER = 0x55559555;
GPIOB->AFR[0] = 0xCCCCCCCC;
GPIOB->AFR[1] = 0xCCCCCCCC;
FSMC Setup/init (2nd attempt)
/* NOR Flash, asynchronous, multiplexed */
/* Control register */
FSMC_Bank1->BTCR[0] = FSMC_BCR1_WREN | FSMC_BCR1_FACCEN | FSMC_BCR1_MWID_0 /* 16-bit */ | FSMC_BCR1_MTYP_1 /* NOR Flash */ | FSMC_BCR1_MUXEN | FSMC_BCR1_MBKEN;
/* Timing register */
FSMC_Bank1->BTCR[1] = FSMC_BTR1_CLKDIV_0 | FSMC_BTR1_DATAST_0 | FSMC_BTR1_ADDHLD_0 | FSMC_BTR1_ADDSET_1;
Result (2nd attempt)
We want to use the NADV signal as the new clock, instead of CLK.
volatile uint16_t* fsmc = (volatile uint16_t*)0x60000000;
uint16_t w[8] = { 0xFFFF, 0x0000, 0xFFFF, 0x0000, 0xFFFF, 0x0000, 0xFFFF, 0x0000 };
for (uint32_t i = 0; i < 8; i++) {
    fsmc[0] = w[i];
}
We write to the same address in order to force a new memory transaction, and hence a new NADV cycle, for every word.
The problem is that the data bus is updated after the positive edge of the NADV "clock". This issue can be overcome by multiplexing the address and data bus and putting the data value on the bus as the address. The ADDSET value is also increased in order to have a more balanced clock (ADDSET=3).
for (uint32_t i = 0; i < 8; i++) {
    uint16_t v = w[i];
    fsmc[v] = v; /* the data value is also used as the address */
}
Unfortunately, the overall clock speed decreases because of the address "trick".
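The address "trick" relies on how the FSMC maps AHB addresses onto the bus: for a 16-bit wide memory, HADDR[25:1] is driven on FSMC_A[24:0], and in multiplexed mode the low 16 address bits share the pins of the data bus during the address phase. The following host-side sketch illustrates that mapping only; the helper names are hypothetical, and 0x60000000 is the bank 1 base address used in this article.

```c
#include <stdint.h>

#define FSMC_BANK1_BASE 0x60000000u

/* Hypothetical helper: AHB byte address to write so that the 16-bit value v
 * shows up on the address lines (same as indexing a uint16_t* by v). */
uint32_t haddr_for_value(uint16_t v)
{
    return FSMC_BANK1_BASE + ((uint32_t)v << 1);
}

/* Hypothetical helper: low 16 address bits driven for a 16-bit wide memory,
 * i.e. HADDR[16:1], which appear on AD[15:0] in multiplexed mode. */
uint16_t address_on_ad_bus(uint32_t haddr)
{
    return (uint16_t)((haddr >> 1) & 0xFFFFu);
}
```

For example, writing the value 0x1234 through a 16-bit pointer indexed by 0x1234 targets byte address 0x60002468, and the address phase then carries 0x1234 on the multiplexed lines before the data phase carries the same value.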
Conclusion
A "nice looking" 16-bit parallel signal with clock can be generated at approx. 16 MHz using the memory controller (FSMC) in asynchronous NOR Flash mode. 20 MHz can be achieved with an external clock divider (div 4) in synchronous PSRAM mode. If the clock edge may be aligned with the data edge, 27 MHz is possible from SRAM.
Note: the FMC (Flexible Memory Controller, also supporting SDRAM) in SDRAM mode can generate a synchronous burst of one data word per clock. In that case, 84 MHz is possible in theory. I do not have the hardware to test it.